Yue Huang | 黄跃

PhD Student in Computer Science and Engineering · University of Notre Dame

About Me

I’m a second-year PhD student in the MINE Lab of the Department of Computer Science and Engineering (CSE) at the University of Notre Dame, which I joined in Fall 2024, supervised by Leonard C. Bettex Collegiate Prof. Xiangliang Zhang. I’m also a graduate student in the Foundation Models and Applications Lab (FMAL) at the Lucy Family Institute for Data & Society. I obtained my bachelor’s degree from Sichuan University in 2024. In summer 2025, I worked with Prasanna Sattigeri and Pin-Yu Chen at the MIT-IBM Watson AI Lab and IBM Research AI. Previously, I was a visiting student under the guidance of Prof. Lichao Sun, an experience enriched by mentorship from Prof. Philip S. Yu. Before that, I worked under Prof. Tang Jie and Dr. Xiao Liu at Tsinghua University.

  • I welcome the opportunity to connect with colleagues in my field as well as those from interdisciplinary areas.
  • My recent research centers on: (1) the science of foundation models, with a particular emphasis on their trustworthiness; (2) edge or tailed alignment of foundation models; (3) dynamic evaluation protocols tailored for generative models; and (4) challenging the capability boundary of frontier models (e.g., VAB).
  • If you are interested in my research, please contact me via email or reach out to arrange an in-person conversation.

Research Interests

My research is centered on three pivotal directions:

Trustworthy, Aligned, and Democratically Governed Generative Foundation Models. This line of inquiry seeks to develop robust frameworks for evaluating trustworthiness and to identify strategies for enhancing the trustworthiness of these models within specific application domains. This includes: ICML'24, NAACL'24, ACM CCS'24, WWW'24, EMNLP'24, NeurIPS'24, ICLR'25d, and NeurIPS'25a.
Data-Driven Scalable Alignment for General-Purpose AI Systems. This research emphasizes data-centric methods to enable scalable model alignment and evolution, ensuring that they adhere to human values and ethical paradigms throughout the development process. This includes: ACL'24, EMNLP'24, ICLR'25a, ICLR'25b, AAAI'26a, AAAI'26b, and ICLR'26b.
Scientific AI and Societal AI. This research area critically assesses the practical impact of generative models, with a particular focus on their applications in AI4Science, exploring their transformative potential and interdisciplinary contributions in fields such as agentic models, the social sciences, and beyond. This includes: ICLR'24, ICLR'25c, NeurIPS'25b, and New Blog.

News

Call for Papers
RelSciFM @ KDD 2026 — Workshop on Reliable Scientific Foundation Models: Design, Training, Grounding, and Verification
Jeju, Korea  ·  August 2026  ·  Submission Deadline: April 30, 2026
Submit →
Apr. 2026  Two papers are accepted by ACL 2026 Findings.
Mar. 2026  I will join Microsoft Research as a research scientist intern this summer; see you in Seattle. One paper is accepted by WWW 2026 Demo, and one is accepted by CHI 2026 (Honourable Mention Award)!
Jan. 2026  Four papers are accepted by ICLR 2026! See you in Rio de Janeiro, Brazil!
Nov. 2025  I was selected as a Jetstream2 NAIRR AI Fellow. Four papers are accepted by AAAI 2026 (Priority Alignment (SPA) and SECURE are selected as oral, RefineLab and RMO as poster). Yujun's LabSafetyBench is accepted by Nature Machine Intelligence, congrats!
Oct. 2025  I will present a tutorial on "Science of Trustworthy Generative Foundation Models" at NeurIPS 2025, and we are organizing one workshop at NeurIPS 2025. Hope to see you in Mexico City!
Sep. 2025  We have four papers accepted by NeurIPS 2025 (huge thanks to Xiangqi, Yanbo, and all other co-authors), and another paper (EmoNest) is accepted by the NeurIPS 2025 Creative AI track (try our demo at the conference in Dec.). See you in San Diego for our posters!
Aug. 2025  One paper is accepted by EMNLP 2025 Findings and two papers are accepted by CIKM 2025.
Jul. 2025   Preference Leakage won the best paper award at DIG-BUGs@ICML 2025, and PsychometricBench won the best paper award at SciSocLLM@KDD'25. One paper is accepted by COLM 2025.
May. 2025   Two papers are accepted by ACL 2025 (1 Main + 1 Findings).
Mar. 2025   TrustEval is accepted by NAACL 2025 Demo and UPME is accepted by CVPR 2025.
Jan. 2025   Four papers have been accepted by ICLR 2025! I was selected as KAUST Rising Stars in AI Symposium 2025 (24/300+).
Dec. 2024   I will join IBM Research as a Research Scientist Intern in Summer 2025. See you in Cambridge, MA.
Sep. 2024   HonestLLM has been accepted by NeurIPS 2024. Congratulations to Chujie! Another paper has been accepted by the main conference of EMNLP 2024.
Aug. 2024   Attack LLM-as-a-Judge has been accepted by ACM CCS 2024. Awarded OpenAI's Researcher Access Program.
May. 2024   TrustLLM has been accepted by ICML 2024. Another paper has been accepted by the main conference of ACL 2024.
Mar. 2024   One paper has been accepted by NAACL 2024. Another paper has been accepted as a short paper of WWW 2024.
Jan. 2024   MetaTool has been accepted by ICLR 2024!

Selected Publications

Disclaimer: This material is presented to ensure the timely dissemination of scholarly works. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms invoked by each author’s copyright.

*: Equal Contribution

Guardian-as-an-Advisor: Advancing Next-Generation Guardian Models for Trustworthy LLMs

Yue Huang, Haomin Zhuang, Jiayi Ye, Han Bao, Yanbo Wang, Hang Hua, Siyuan Wu, Pin-Yu Chen, Xiangliang Zhang

Findings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026 Findings)

TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models

Yue Huang, Chujie Gao, Siyuan Wu, Haoran Wang, Xiangqi Wang, Jiayi Ye, Yujun Zhou, Yanbo Wang, et al.

The Fourteenth International Conference on Learning Representations (ICLR 2026)

(Featured by the Hoover Institution Technology Policy Accelerator and the Notre Dame-IBM Technology Ethics Lab)

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Yue Huang, Hang Hua, Yujun Zhou, Pengcheng Jing, Manish Nagireddy, Inkit Padhi, Greta Dolcetti, Zhangchen Xu, Subhajit Chaudhury, Ambrish Rawat, Liubov Nedoshivina, Pin-Yu Chen, Prasanna Sattigeri, Xiangliang Zhang

The Fourteenth International Conference on Learning Representations (ICLR 2026)

SPA: Achieving Consensus in LLM Alignment via Self-Priority Optimization

Yue Huang, Xiangqi Wang, Xiangliang Zhang

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026 Oral)

RMO: Towards Better LLM Alignment via Reshaping Reward Margin Distributions

Yanchi Ru*, Yue Huang*, Xiangliang Zhang

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026)

Better Datasets Start From RefineLab: Automatic Optimization for High-Quality Dataset Refinement

Xiaonan Luo*, Yue Huang*, Ping He, Xiangliang Zhang

The 40th Annual AAAI Conference on Artificial Intelligence (AAAI 2026)

ChemOrch: Empowering LLMs with Chemical Intelligence via Groundbreaking Synthetic Instructions

Yue Huang*, Zhengzhe Jiang*, Xiaonan Luo, Kehan Guo, Haomin Zhuang, Yujun Zhou, Zhengqing Yuan, Xiaoqi Sun, Jules Schleinitz, Yanbo Wang, Shuhao Zhang, Mihir Surve, Nitesh V Chawla, Olaf Wiest, Xiangliang Zhang

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

(Highlighted in the semi-annual meeting hosted by NSF Center for Computer-Assisted Synthesis (C-CAS))

AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking

Xiangqi Wang*, Yue Huang*, Yanbo Wang, Xiaonan Luo, Kehan Guo, Yujun Zhou, Xiangliang Zhang

The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

Exposing and Patching the Flaws of Large Language Models in Social Character Simulation

Yue Huang*, Zhengqing Yuan*, Yujun Zhou, Kehan Guo, Xiangqi Wang, Haomin Zhuang, Weixiang Sun, Lichao Sun, Jindong Wang, Yanfang Ye, Xiangliang Zhang

Second Conference on Language Modeling (COLM 2025)

DataGen: Unified Synthetic Dataset Generation via Large Language Models

Yue Huang*, Siyuan Wu*, Chujie Gao, Dongping Chen, Qihui Zhang, Yao Wan, Tianyi Zhou, Chaowei Xiao, Jianfeng Gao, et al.

The Thirteenth International Conference on Learning Representations (ICLR 2025)

Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge

Jiayi Ye*, Yanbo Wang*, Yue Huang*, Dongping Chen, Qihui Zhang, Nuno Moniz, Tian Gao, Werner Geyer, Chao Huang, Pin-Yu Chen, Nitesh V Chawla, Xiangliang Zhang

The Thirteenth International Conference on Learning Representations (ICLR 2025)

(Highlighted by the Notre Dame-IBM Technology Ethics Lab, with a related tutorial video)

My Favorite Streamer is an LLM: Discovering, Bonding, and Co-Creating in AI VTuber Fandom

Jiayi Ye, Chaoran Chen, Yue Huang, Yanfang Ye, Toby Jia-Jun Li, Xiangliang Zhang

2026 CHI Conference on Human Factors in Computing Systems (ACM CHI 2026, Honourable Mention Award)

(Cited by Wikipedia and featured in a high-impact Bilibili video with 247K+ views, 10.6K likes, and 6.3K saves)

TrustLLM: Trustworthiness in Large Language Models

Yue Huang, Lichao Sun, Haoran Wang, Siyuan Wu, Qihui Zhang, Chujie Gao, Yixin Huang, Wenhan Lyu, Yixuan Zhang, Xiner Li, Zhengliang Liu, Yixin Liu, Yijue Wang, Zhikun Zhang, Bhavya Kailkhura, Caiming Xiong, et al.

2024 International Conference on Machine Learning (ICML 2024)

(Widely recognized by leading U.S. and international AI safety communities: highlighted by United States Department of Homeland Security (DHS) & International AI Safety Report; listed in the U.S. DoD CDAO Generative AI Responsible AI Toolkit; featured in NIST AI 100-2e2025; adopted as an official benchmark in all three editions of the FLI AI Safety Index to grade Anthropic, OpenAI, Google DeepMind, xAI, Meta, DeepSeek, Z.ai, and Alibaba Cloud; profiled by Lawrence Livermore National Laboratory (U.S. DOE/NNSA); chapter-length treatment in the Springer textbook Introduction to Foundation Models (2025, Ch. 12 "Trustworthiness Evaluation of Large Language Models"); supported by a Microsoft Accelerate Foundation Models Research Award; Invited Talk at IBM Research)

MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use

Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu, Qihui Zhang, Yixin Liu, Pan Zhou, et al.

The Twelfth International Conference on Learning Representations (ICLR 2024)

1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators?

Yue Huang*, Chenrui Fan*, Yuan Li, Siyuan Wu, et al.

The 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP 2024)

HonestLLM: Toward an Honest and Helpful Large Language Model

Chujie Gao*, Siyuan Wu*, Yue Huang*, Dongping Chen*, Qihui Zhang*, et al.

Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS 2024)

Talks & Guest Lecture

Jan. 2026
Conference Tutorial: Towards Trustworthy and Socially Responsible Generative Foundation Models
Dec. 2025
Conference Tutorial: Science of Trustworthy Generative Foundation Models
Nov. 2025
Conference Tutorial: Responsible Generative Foundation Models: From Principles to Real-World Impact
Nov. 2025
Conference Tutorial: Generative Models for Synthetic Data: Transforming Data Mining in the GenAI Era
Nov. 2025
Conference Tutorial: Socially Responsible and Trustworthy Generative Foundation Models: Principles, Challenges, and Practices
May. 2025
Toward Socially Impactful and Trustworthy Generative Foundation Models
@ University of Illinois Urbana-Champaign · Host: Heng Ji & Chi Han
Apr. 2025
On the Trustworthiness of Generative Foundation Models
Mar. 2025
Guest Lecture: Trustworthiness in Large Language Models
@ University of Virginia · Instructor: Chirag Agarwal
Feb. 2025
Guest Lecture: Toward Socially Impactful and Trustworthy Generative Foundation Models
@ University of Southern California · Instructor: Jieyu Zhao
Jul. 2024
Bias of Large Language Models
Feb. 2024
Trustworthiness in Large Language Models

Recognition & Impact

  • United States Department of Homeland Security (DHS). Highlighted TrustLLM: Trustworthiness in Large Language Models in this report.
  • International AI Safety Report. Cited TrustLLM: Trustworthiness in Large Language Models in both the 2025 edition and the 2026 edition — with the 2026 report elevating it to a core empirical source supporting the report's central finding on "persistent deficiencies on measures of truthfulness, safety, and robustness" (p. 99).
  • U.S. DoD CDAO Generative AI Responsible AI Toolkit. Listed TrustLLM: Trustworthiness in Large Language Models in the official toolkit.
  • NIST AI 100-2e2025. Featured TrustLLM: Trustworthiness in Large Language Models in Section 3.6, "Benchmarks for AML Vulnerabilities" of the official report.
  • FLI AI Safety Index. Adopted TrustLLM: Trustworthiness in Large Language Models as an official benchmark in all three editions: AI Safety Index 2024, AI Safety Index Summer 2025, and AI Safety Index Winter 2025.
  • Lawrence Livermore National Laboratory (U.S. DOE/NNSA). Profiled TrustLLM: Trustworthiness in Large Language Models in Evaluating trust and safety of large language models.
  • Springer textbook. Gave TrustLLM: Trustworthiness in Large Language Models chapter-length treatment in Introduction to Foundation Models (2025, Ch. 12).
  • Hoover Institution Technology Policy Accelerator. Featured TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models in the Technology Policy Accelerator.
  • Notre Dame-IBM Technology Ethics Lab. Selected TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models as a Collaborative Project.
  • Notre Dame-IBM Technology Ethics Lab. Highlighted Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge in an official highlight.
  • IBM tutorial video. Featured Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge in a dedicated tutorial video.
  • Science. Reported on Benchmarking large language models on safety risks in scientific laboratories in Leading AI models miss dangerous lab risks.
  • New Scientist. Highlighted Benchmarking large language models on safety risks in scientific laboratories in All major AI models risk encouraging dangerous science experiments.

Honors and Awards

Mar. 2026   CHI 2026 Honourable Mention Award
Dec. 2025   Tinker Research Grant
Nov. 2025   Jetstream2 NAIRR AI Fellow
Aug. 2025   NSF Discover ACCESS Project
Aug. 2025   NSF POSE Training Award (Role: Industry Mentor)
Jul. 2025   Best Paper Award of SciSocLLM@KDD’25
Jul. 2025   Best Paper Award of DIG-BUG@ICML’25
Jan. 2025   KAUST AI Rising Star
Jul. 2024   OpenAI's Researcher Access Program
Jun. 2024   Elite Student of School of Cyber Science and Engineering, Sichuan University (网安菁英)
Jan. 2024   Microsoft Accelerate Foundation Models Research

Academic Participation

Journal Reviewer
Nature Communications, IEEE Transactions on Artificial Intelligence (TAI), IEEE Transactions on Dependable and Secure Computing (TDSC), IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), ACM Transactions on Intelligent Systems and Technology (ACM TIST).
Conference Reviewer
NeurIPS (2025-), ICLR (2024-), ICML (2025-), AAAI (2025-), KDD (2025-), ICDM (2024-), WWW (2024-), COLM (2025-), ACL Rolling Review (2024-), EMNLP Demo Track (2024-), NAACL Demo Track (2025-), ACL Demo Track (2025-).
Technical Committee
Technical Committee Member of the 2024 IEEE Computer Society North America Student Challenge.
Workshop Organizer
Organizer of the NeurIPS 2025 Workshop on Socially Responsible and Trustworthy Foundation Models (ResponsibleFM) and the KDD 2026 Workshop on Reliable Scientific Foundation Models: Design, Training, Grounding, and Verification (RelSciFM).

Education

Sep. 2024 – Present
Ph.D. in Computer Science and Engineering, University of Notre Dame
Sep. 2020 – Jun. 2024
BEng in Cybersecurity, Sichuan University

Internships

Jun. 2026 – Sep. 2026
Research Scientist Intern, Microsoft Research
Aug. 2025 – Dec. 2025
Researcher
May 2025 – Aug. 2025
Research Intern, IBM Research
Sep. 2023 – Jan. 2024
Research Intern

Acknowledgment

I am honored that my research has been funded, supported, or recognized by many organizations and programs.

Misc

  • I spent 18 years in my hometown, Fujian 🇨🇳, and had 4 wonderful years of university life in Sichuan 🌶️ (I can handle spicy food!).
  • I love exchanging ideas with people from different fields 🌍 — it helps me see the world more broadly.
  • My favorite singers are Eason Chan and Steve Chou (小剛) 🎤. Lately, I've been listening to Patti Tsai.
  • My favorite sports are swimming 🏊 and badminton 🏸. I also enjoy capturing scenic moments with my camera 📷.
  • I'm deeply grateful to those who've helped me along the way 🙏 — thank you for helping me go further!