Jiaqi Xue

I am a third-year Ph.D. student in the Computer Science Department at the University of Central Florida, advised by Prof. Qian Lou. Before that, I obtained my Bachelor's degree from Chongqing University in 2022.

My research interests lie in machine learning security, particularly Trojan attacks and defenses for AI models and AI privacy protection. Reach out to me via email: jiaqi.xue@ucf.edu

CV  /  Google Scholar  /  LinkedIn  /  GitHub


News

  • [Sep. 2024] One paper accepted to EMNLP 2024.
  • [Sep. 2024] One paper accepted to IEEE S&P (Oakland) 2025.
  • [Aug. 2024] Two papers accepted to CCS-LAMPS 2024.
  • [Jul. 2024] One paper accepted to ECCV 2024.
  • [Jun. 2024] One paper accepted to PACT 2024.
  • [May. 2024] I joined Samsung Research America as a research intern.
  • [May. 2024] One paper accepted to ACL 2024.
  • [Mar. 2024] One paper accepted to NAACL 2024 (5.3% oral presentation acceptance rate).
  • [Oct. 2023] Received NeurIPS 2023 Scholar Award.
  • [Sep. 2023] One paper accepted to NeurIPS 2023.
  • [Jan. 2023] I joined UCF as a Ph.D. student.
  • [Jun. 2022] I received my B.S. from the College of Computer Science, Chongqing University. GPA: 3.82/4.0 (top 4%).

Selected Publications

(*: Equal contribution)

DataSeal: Ensuring the Verifiability of Private Computation on Encrypted Data
Muhammad Husni Santriaji, Jiaqi Xue, Yancheng Zhang, Qian Lou and Yan Solihin
IEEE S&P (Oakland), 2025
pdf

DataSeal enhances the verifiability and integrity of private computation in Fully Homomorphic Encryption (FHE) by incorporating Algorithm-based Fault Tolerance (ABFT).

BadFair: Backdoored Fairness Attacks with Group-conditioned Triggers
Jiaqi Xue, Qian Lou, Mengxin Zheng
EMNLP, 2024
pdf

BadFair is a novel, model-agnostic backdoored fairness attack that allows a model to appear fair and accurate on clean inputs while exhibiting discriminatory behavior toward specific groups on triggered inputs. It evades traditional bias and backdoor detection methods, achieving an 88.7% attack success rate for the target group with only a 1.2% accuracy drop across tasks.

SSL-Cleanse: Trojan Detection and Mitigation in Self-Supervised Learning
Mengxin Zheng*, Jiaqi Xue*, Zihao Wang, Xun Chen, Qian Lou, Lei Jiang, Xiaofeng Wang
ECCV, 2024
pdf

SSL-Cleanse detects and mitigates Trojan attacks in SSL encoders without access to any downstream labels. We evaluated SSL-Cleanse on various datasets using 1,200 models, achieving an average detection success rate of 82.2% on ImageNet-100. After mitigation, backdoored encoders retain only a 0.3% attack success rate on average, without significant accuracy loss.

CR-UTP: Certified Robustness against Universal Text Perturbations
Qian Lou, Xin Liang*, Jiaqi Xue*, Yancheng Zhang, Rui Xie, Mengxin Zheng
ACL, 2024  
pdf

CR-UTP addresses the challenge of certifying language model robustness against Universal Text Perturbations (UTPs) and input-specific text perturbations (ISTPs). We introduce a superior prompt search method and a superior prompt ensembling technique to enhance certified accuracy against both UTPs and ISTPs.

TrojFSP: Trojan Insertion in Few-shot Prompt Tuning
Mengxin Zheng, Jiaqi Xue, Xun Chen, Yanshan Wang, Qian Lou, Lei Jiang
NAACL, 2024   (Oral Presentation)
pdf / code

TrojFSP addresses few-shot backdoor attacks, in which a Trojan is injected through a limited number of prompt tokens while the training parameters of the pre-trained language model (PLM) remain fixed.

TrojLLM: A Black-box Trojan Prompt Attack on Large Language Models
Jiaqi Xue, Mengxin Zheng, Ting Hua, Yilin Shen, Yepeng Liu, Ladislau Bölöni, Qian Lou
NeurIPS, 2023
pdf / code / slides / poster

TrojLLM is a framework for exploring the security vulnerabilities of LLMs, which are increasingly employed in a wide range of applications. It automates the generation of stealthy, universal triggers that corrupt LLMs' outputs, employing a unique trigger-discovery algorithm that manipulates LLM-based APIs with minimal data.

Teaching Experience

  • [Jan. 2025 - May 2025] Teaching Assistant for CAP6614 - Current Topics in Machine Learning
  • [Sep. 2024 - Dec. 2024] Teaching Assistant for CDA5106 - Advanced Computer Architecture
  • [Jan. 2024 - May 2024] Teaching Assistant for CAP6614 - Current Topics in Machine Learning
  • [Sep. 2023 - Dec. 2023] Teaching Assistant for CDA5106 - Advanced Computer Architecture
  • [May. 2023 - Aug. 2023] Teaching Assistant for CDA3103 - Computer Logic and Organization

Work Experience

  • [May. 2024 - Aug. 2024] AI Research Intern, Samsung Research America
  • [Mar. 2022 - Jun. 2022] Machine Learning Intern, Kuaishou Y-tech Lab

Service

Reviewer

  • International Joint Conference on Artificial Intelligence (IJCAI)
  • Neural Information Processing Systems (NeurIPS)
  • International Conference on Learning Representations (ICLR)
  • Computer Vision and Pattern Recognition (CVPR)
  • Artificial Intelligence and Statistics (AISTATS)
  • Annual Meeting of the Association for Computational Linguistics (ACL)

Preprints

(*: Equal contribution)

BadRAG: Identifying Vulnerabilities in Retrieval Augmented Generation of Large Language Models
Jiaqi Xue, Mengxin Zheng, Yebowen Hu, Fei Liu, Xun Chen, Qian Lou
Under Review
pdf

This paper introduces BadRAG, a novel framework targeting security vulnerabilities in RAG’s retrieval and generative phases. Utilizing contrastive optimization, BadRAG generates adversarial passages activated only by specific triggers. We also explore leveraging LLM alignment to conduct denial-of-service and sentiment steering attacks.

Audit and Improve Robustness of Private Neural Networks on Encrypted Data
Jiaqi Xue, Lei Xu, Lin Chen, Weidong Shi, Kaidi Xu, Qian Lou
Under Review
pdf

Performing neural network inference on encrypted data without decryption is a popular approach to enabling privacy-preserving neural networks (PNet) as a service. Compared with regular neural networks deployed for machine-learning-as-a-service, PNet requires additional encodings, e.g., quantized-precision numbers and polynomial activations. Encrypted input also introduces novel challenges for adversarial robustness and security. To the best of our knowledge, we are the first to study questions including (i) whether PNet is more robust against adversarial inputs than regular neural networks, and (ii) how to design a robust PNet given encrypted input without decryption.