Publications


Citation counts available on Google Scholar. * denotes equal contribution.

Conference

2026

ProbeLLM: Automating Principled Diagnosis of LLM Failures

Yue Huang*, Zhengzhe Jiang*, Yuchen Ma*, Yu Jiang, Xiangqi Wang, Yujun Zhou, Yuexing Hao, Kehan Guo, Pin-Yu Chen, Marzyeh Ghassemi, Stefan Feuerriegel, Xiangliang Zhang

The Forty-third International Conference on Machine Learning (ICML 2026)

[Code] [Paper]

Capability-Oriented Training Induced Alignment Risk

Yujun Zhou*, Yue Huang*, Han Bao*, Kehan Guo, Zhenwen Liang, Pin-Yu Chen, Tian Gao, Werner Geyer, Nuno Moniz, Nitesh V Chawla, Xiangliang Zhang

The Forty-third International Conference on Machine Learning (ICML 2026)

Position: Beyond Prediction: Toward Verifiable Physiological Waveform Reasoning with Foundation Models and Agentic LLMs

Xiaoda Wang, Ching Chang, Defu Cao, Kaiqiao Han, Fang Sun, Yue Huang, Minxiao Wang, Chang Xu, Xiao Luo, Runze Yan, Xiangliang Zhang, Xiao Hu, Yan Liu, Yizhou Sun, Wei Wang, Carl Yang

The Forty-third International Conference on Machine Learning (ICML 2026)

RiskLab: A Controlled Toolkit for Probing Emergent Risks in LLM-Based Multi-Agent Systems

Yu Jiang, Wenjie Wang, Yue Huang, Yanbo Wang, Zhenhong Zhou, Xiuying Chen, Yang Liu, Pin-Yu Chen, Wei Wang, Xiangliang Zhang

The 64th Annual Meeting of the Association for Computational Linguistics — System Demonstrations (ACL 2026 Demo)

[Code] [Docs.] [Blog]

AutoDavis: Automatic and Dynamic Evaluation Protocol of Large Vision-Language Models on Visual Question-Answering

Han Bao, Yue Huang, Yanbo Wang, Jiayi Ye, Xiangqi Wang, Xiuying Chen, Yue Zhao, Tianyi Zhou, Mohamed Elhoseiny, Xiangliang Zhang

The 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

Synthetic Interaction Data for Scalable Personalization in Large Language Models

Yuchen Ma, Yue Huang, Wenjie Wang, Xiaonan Luo, Xiangliang Zhang, Stefan Feuerriegel

The 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

[Code] [Docs.] [Paper]

Recipes for Agents: Understanding Skills and Their Open Questions

Hanwen Xing, Haomin Zhuang, Xuandong Zhao, Yue Huang, Zhenheng Tang, Xiangliang Zhang

The 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026, Blue Sky)

TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models

Yue Huang, Chujie Gao, Siyuan Wu, Haoran Wang, Xiangqi Wang, Jiayi Ye, Yujun Zhou, Yanbo Wang, et al.

The Fourteenth International Conference on Learning Representations (ICLR 2026)

[Code] [Website] [Docs.]

(Featured by the Hoover Institution Technology Policy Accelerator and the Notre Dame-IBM Technology Ethics Lab)

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Yue Huang, Hang Hua, Yujun Zhou, Pengcheng Jing, Manish Nagireddy, Inkit Padhi, Greta Dolcetti, Zhangchen Xu, Subhajit Chaudhury, Ambrish Rawat, Liubov Nedoshivina, Pin-Yu Chen, Prasanna Sattigeri, Xiangliang Zhang

The Fourteenth International Conference on Learning Representations (ICLR 2026)

[Code] [Model] [Paper]

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Dawei Li, Renliang Sun, Yue Huang, Ming Zhong, Bohan Jiang, Jiawei Han, Xiangliang Zhang, Wei Wang, Huan Liu

The Fourteenth International Conference on Learning Representations (ICLR 2026)

RankLLM: Weighted Ranking of LLMs by Quantifying Question Difficulty

Xingjian Hu*, Ziqian Zhang*, Yue Huang*, Kai Zhang, Ruoxi Chen, Yixin Liu, Qingsong Wen, Kaidi Xu, Xiangliang Zhang, Neil Zhenqiang Gong, Lichao Sun

The Fourteenth International Conference on Learning Representations (ICLR 2026)

SPA: Achieving Consensus in LLM Alignment via Self-Priority Optimization

Yue Huang, Xiangqi Wang, Xiangliang Zhang

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026 Oral)

RMO: Towards Better LLM Alignment via Reshaping Reward Margin Distributions

Yanchi Ru*, Yue Huang*, Xiangliang Zhang

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026)

Better Datasets Start From RefineLab: Automatic Optimization for High-Quality Dataset Refinement

Xiaonan Luo*, Yue Huang*, Ping He, Xiangliang Zhang

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026)

SECURE: Safety Enforcement Constraint Using Regularized Orthogonality for LLM Fine-Tuning

Shuo Yang, Qihui Zhang, Yuyang Liu, Yue Huang, et al.

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026 Oral)

PolicyLLM: Towards Excellent Comprehension of Public Policy for Large Language Models

Han Bao, Penghao Zhang, Yue Huang, Zhengqing Yuan, Yanchi Ru, SU RUI, Yujun Zhou, Xiangqi Wang, Kehan Guo, Nitesh V Chawla, Yanfang Ye, Xiangliang Zhang

Findings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026 Findings)

Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs

Yue Huang, Haomin Zhuang, Jiayi Ye, Han Bao, Yanbo Wang, Hang Hua, Siyuan Wu, Pin-Yu Chen, Xiangliang Zhang

Findings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026 Findings)

Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?

Yichen Feng, Yuetai Li, Chunjiang Liu, Fengqing Jiang, Yue Huang, Yuanyuan Chen, Hang Hua, Zhengqing Yuan, Kaiyuan Zheng, Luyao Niu, Bhaskar Ramasubramanian, Basel Alomair, Xiangliang Zhang, Misha Sra, Zichen Chen, Radha Poovendran, Zhangchen Xu

Third Conference on Language Modeling (COLM 2026)

[Paper] [Website] [Code]

My Favorite Streamer is an LLM: Discovering, Bonding, and Co-Creating in AI VTuber Fandom

Jiayi Ye, Chaoran Chen, Yue Huang, Yanfang Ye, Toby Jia-Jun Li, Xiangliang Zhang

2026 CHI Conference on Human Factors in Computing Systems (ACM CHI 2026, Honourable Mention Award)

(Cited by Wikipedia and featured in a high-impact Bilibili video with 247K+ views, 10.6K likes, and 6.3K saves)

IntraAI: Bridging Human-AI Understanding Through Intelligent Interface Design

Han Bao, Yue Huang, Xiangliang Zhang, Yanfang Ye

Demonstrations of The Web Conference 2026 (WWW 2026 Demo)

2025

ChemOrch: Empowering LLMs with Chemical Intelligence via Groundbreaking Synthetic Instructions

Yue Huang*, Zhengzhe Jiang*, Xiaonan Luo, Kehan Guo, Haomin Zhuang, Yujun Zhou, Zhengqing Yuan, Xiaoqi Sun, Jules Schleinitz, Yanbo Wang, Shuhao Zhang, Mihir Surve, Nitesh V Chawla, Olaf Wiest, Xiangliang Zhang

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

[Code]

(Highlighted at the semi-annual meeting hosted by the NSF Center for Computer-Assisted Synthesis (C-CAS))

AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking

Xiangqi Wang*, Yue Huang*, Yanbo Wang, Xiaonan Luo, Kehan Guo, Yujun Zhou, Xiangliang Zhang

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

[Code]

EmoNest: Crafting Emotionally-Adaptive Experiences with Generative AI

Yue Huang, Siyuan Wu

The Thirty-Ninth Annual Conference on Neural Information Processing Systems, Creative AI Track: Humanity (NeurIPS 2025 Creative AI Track)

DyFlow: Dynamic Workflow Framework for Agentic Reasoning

Yanbo Wang, Zixiang Xu, Yue Huang, Xiangqi Wang, Zirui Song, Lang Gao, Chenxi Wang, Xiangru Tang, Yue Zhao, Arman Cohan, Xiangliang Zhang, Xiuying Chen

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

[Code] [Paper]

Adaptive Distraction: Probing LLM Contextual Robustness with Automated Tree Search

Yanbo Wang*, Zixiang Xu*, Yue Huang*, Chujie Gao, Siyuan Wu, Jiayi Ye, Pin-Yu Chen, Xiuying Chen, Xiangliang Zhang

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

[Code]

Jailbreaking Large Language Models Through Alignment Vulnerabilities in Out-of-Distribution Settings

Yue Huang, Jingyu Tang, Dongping Chen, Bingda Tang, Yao Wan, Lichao Sun, Philip S. Yu, Xiangliang Zhang

34th ACM International Conference on Information and Knowledge Management (CIKM 2025)

Exposing and Patching the Flaws of Large Language Models in Social Character Simulation

Yue Huang*, Zhengqing Yuan*, Yujun Zhou, Kehan Guo, Xiangqi Wang, Haomin Zhuang, Weixiang Sun, Lichao Sun, Jindong Wang, Yanfang Ye, Xiangliang Zhang

Second Conference on Language Modeling (COLM 2025)

Think it Image by Image: Multi-Image Moral Reasoning of Large Vision-Language Models

Chujie Gao, Yue Huang, Xiangqi Wang, Siyuan Wu, Nitesh Chawla, Xiangliang Zhang

34th ACM International Conference on Information and Knowledge Management (CIKM 2025)

Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study

Yujun Zhou, Jiayi Ye, Zipeng Ling, Yufei Han, Yue Huang, Haomin Zhuang, Zhenwen Liang, Kehan Guo, Taicheng Guo, Xiangqi Wang, Xiangliang Zhang

The 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025)

DataGen: A Unified Framework for Textual Dataset Generation Using Large Language Models

Yue Huang*, Siyuan Wu*, Chujie Gao, Dongping Chen, Qihui Zhang, Yao Wan, Tianyi Zhou, Xiangliang Zhang, Jianfeng Gao, Chaowei Xiao, Lichao Sun

The Thirteenth International Conference on Learning Representations (ICLR 2025)

[Toolkit] [Website] [Tutorial Video]

Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge

Jiayi Ye*, Yanbo Wang*, Yue Huang*, Dongping Chen, Qihui Zhang, Nuno Moniz, Tian Gao, Werner Geyer, Chao Huang, Pin-Yu Chen, Nitesh V Chawla, Xiangliang Zhang

The Thirteenth International Conference on Learning Representations (ICLR 2025)

[Website] [Coverage] [Tutorial Video]

(Highlighted by the Notre Dame-IBM Technology Ethics Lab, with a related tutorial video)

GUI-World: A Dataset Towards GUI-Orientated Multimodal Large Language Models

Dongping Chen*, Yue Huang*, Siyuan Wu*, Jingyu Tang*, Huichi Zhou, Qihui Zhang, Zhigang He, Yilin Bai, Chujie Gao, et al.

The Thirteenth International Conference on Learning Representations (ICLR 2025)

[Website] [Dataset] [Model]

Artificial Intelligence in Spectroscopy: Advancing Chemistry from Prediction to Generation and Beyond

Kehan Guo, Yili Shen, Gisela Abigail Gonzalez-Montiel, Yue Huang, Yujun Zhou, Mihir Surve, Zhichun Guo, Prayel Das, Nitesh V Chawla, Olaf Wiest, Xiangliang Zhang

2025 International Joint Conference on Artificial Intelligence (IJCAI 2025)

[GitHub]

Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment

Dongping Chen, Ruoxi Chen, Shu Pu, Zhaoyi Liu, Yanru Wu, Caixi Chen, Benlin Liu, Yue Huang, Yao Wan, Pan Zhou, Ranjay Krishna

The Thirteenth International Conference on Learning Representations (ICLR 2025)

[Website] [Code]

Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models

Zixiang Xu*, Yanbo Wang*, Yue Huang*, Xiuying Chen, Jieyu Zhao, Meng Jiang, Xiangliang Zhang

The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025)

Beyond Single-Value Metrics: Evaluating and Enhancing LLM Unlearning with Cognitive Diagnosis

Yicheng Lang*, Kehan Guo*, Yue Huang, Yujun Zhou, Haomin Zhuang, Tianyu Yang, Yao Su, Xiangliang Zhang

Findings of The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025 Findings)

UPME: An Unsupervised Peer Review Framework for Multimodal Large Language Model Evaluation

Qihui Zhang, Munan Ning, Zheyuan Liu, Yanbo Wang, Jiayi Ye, Yue Huang, Shuo Yang, Xiao Chen, Yibing Song, Li Yuan

The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025 (CVPR 2025)

TRUSTEVAL: A Dynamic Evaluation Toolkit on Trustworthiness of Generative Foundation Models

Yanbo Wang*, Jiayi Ye*, Siyuan Wu*, Chujie Gao, Yue Huang, Xiuying Chen, Yue Zhao, Xiangliang Zhang

2025 Annual Conference of the North American Chapter of the Association for Computational Linguistics: System Demonstrations (NAACL 2025 Demo)

[Code] [Tutorial Video] [Docs.]

2024

TrustLLM: Trustworthiness in Large Language Models

Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, et al.

2024 International Conference on Machine Learning (ICML 2024)

(Widely recognized by leading U.S. and international AI safety communities: highlighted by the United States Department of Homeland Security (DHS) and the International AI Safety Report; listed in the U.S. DoD CDAO Generative AI Responsible AI Toolkit; featured in NIST AI 100-2e2025 Section 3.6 "Benchmarks for AML Vulnerabilities"; adopted as an official benchmark in all three editions of the FLI AI Safety Index; invited talk at IBM Research; 10,000 toolkit downloads)

[Code & Toolkit] [Website] [Dataset] [Docs.]

MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use

Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu, Qihui Zhang, Yixin Liu, Pan Zhou, et al.

The Twelfth International Conference on Learning Representations (ICLR 2024)

[Code]

1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators?

Yue Huang, Chenrui Fan, Yuan Li, Siyuan Wu, Yao Wan, Tianyi Zhou, Xiangliang Zhang, Lichao Sun

The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024)

HonestLLM: Toward an Honest and Helpful Large Language Model

Chujie Gao*, Siyuan Wu*, Yue Huang*, Dongping Chen*, Qihui Zhang*, et al.

The Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)

[Code]

Optimization-based Prompt Injection Attack to LLM-as-a-Judge

Jiawen Shi, Zenghui Yuan, Yinuo Liu, Yue Huang, Pan Zhou, Lichao Sun, Neil Zhenqiang Gong

The ACM Conference on Computer and Communications Security (ACM CCS 2024)

[Code]

LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?

Qihui Zhang*, Chujie Gao*, Dongping Chen*, Yue Huang, et al.

2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (Findings of NAACL 2024)

[Code] [Website]

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Xiao Liu*, Xuanyu Lei*, Shengyuan Wang, Yue Huang, Zhuoer Feng, et al.

The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024)

(Adopted by leading Chinese LLM developers for official Chinese-alignment evaluation, including Alibaba Qwen, DeepSeek, Zhipu AI (ChatGLM), 01.AI (Yi), Baichuan, and MiniMax (Abab); also used in China Telecom Tele-FLM evaluation and the Tsinghua SuperBench suite)

[Code] [Website]

From Creation to Clarification: ChatGPT's Journey Through the Fake News Quagmire

Yue Huang, Kai Shu, Philip S. Yu, Lichao Sun

2024 ACM Web Conference (WWW 2024)

2023

CyberEA: An Efficient Entity Alignment Framework for Cybersecurity Knowledge Graph

Yue Huang, Yongyan Guo, Cheng Huang

EAI International Conference on Security and Privacy in Communication Networks (SecureComm 2023)

Workshop Paper

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Dawei Li, Renliang Sun, Yue Huang, Ming Zhong, Bohan Jiang, Jiawei Han, Xiangliang Zhang, Wei Wang, Huan Liu

DIG-BUG@ICML 2025 (Best Paper Award)

Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models

Yuan Li, Yue Huang, Hongyi Wang, Xiangliang Zhang, James Zou, Lichao Sun

SciSocLLM@KDD 2025 (Best Paper Award)

Tutorial

2026

Towards Trustworthy and Socially Responsible Generative Foundation Models

Yue Huang, Zhenhong Zhou, Pin-Yu Chen, Xiangliang Zhang

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026 Tutorial)

2025

Science of Trustworthy Generative Foundation Models

Yue Huang, Canyu Chen

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025 Tutorial)

Socially Responsible and Trustworthy Generative Foundation Models: Principles, Challenges, and Practices

Yue Huang, Canyu Chen, Lu Cheng, Bhavya Kailkhura, Nitesh Chawla, Xiangliang Zhang

The 34th International Conference on Information and Knowledge Management (CIKM 2025 Tutorial)

Generative Models for Synthetic Data: Transforming Data Mining in the GenAI Era

Dawei Li, Yue Huang, Ming Li, Tianyi Zhou, Xiangliang Zhang, Huan Liu

The 34th International Conference on Information and Knowledge Management (CIKM 2025 Tutorial)

Responsible GenFMs: From Foundational Principles to Real-World Impact

Yue Huang, Canyu Chen, Lu Cheng, Bhavya Kailkhura, Manlin Li, Xiangliang Zhang

2025 IEEE International Conference on Data Mining (ICDM 2025 Tutorial)