Cheng Wang

Hi, I am Cheng Wang (王程). I'm a final-year undergraduate student at the National University of Singapore (NUS). I am broadly interested in Trustworthy AI, LLM Reasoning, and Agents. I am fortunate to work closely with Prof. Bryan Hooi and Prof. Tat-Seng Chua from NUS, Prof. Tianwei Zhang from NTU, Prof. Junxian He from HKUST, Prof. Muhao Chen from UC Davis, and Prof. Kai-Wei Chang from UCLA.

My primary research interests include:

  • Trustworthy AI: Hallucination Detection, Calibration & Adversarial Robustness.
  • AI Reasoning: Enhancing reasoning abilities of LLMs.
  • LLM Applications: Autonomous agents & RAG systems.

Email  /  Google Scholar  /  LinkedIn  /  GitHub  /  WeChat

I am looking for Fall 2026 PhD opportunities in Trustworthy AI, LLM Reasoning, and Agents. Feel free to contact me!

🔥 News
  • [2025.08] Two first-author papers accepted to EMNLP 2025: one to the Main track and one to Findings!
  • [2025.06] Our paper GuardReasoner-VL is accepted to the ICML 2025 R2-FM Workshop!
  • [2025.04] Our survey on LRM safety is on arXiv now. Check out the paper and repo!
  • [2025.01] One first-author paper is accepted to the NAACL 2025 Main Conference.
  • [2025.01] I started my internship at TikTok as an Algorithm Engineer Intern.
  • [2024.11] One first-author paper is accepted to COLING 2025.
📑 Pre-prints & Publications

* denotes equal contribution.

Unlocking the Pre-Trained Model as a Dual-Alignment Calibrator for Post-Trained LLMs
Cheng Wang*, Beier Luo*, Hongxin Wei, Yixuan Li, Xuefeng Du
Under Review, 2025

Mirage or Method? How Model-Task Alignment Induces Divergent RL Conclusions
Haoze Wu*, Cheng Wang*, Wenshuo Zhao, Junxian He
Under Review, 2025
Paper / Code

False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize
Cheng Wang*, Zeming Wei*, Qin Liu, Wenxuan Zhou, Muhao Chen
NeurIPS MechInterp Workshop (Spotlight)
Paper / Code

Taming Extreme Tokens: Covariance-Aware GRPO with Gaussian-Kernel Advantage Reweighting
Cheng Wang, Qin Liu, Wenxuan Zhou, Muhao Chen
Under Review, 2025

When Audio and Text Disagree: Benchmarking Text Bias in Large Audio-Language Models under Cross-Modal Inconsistencies
Cheng Wang, Gelei Deng, Xianglin Yang, Tianwei Zhang
EMNLP 2025 Main
Paper / Code

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Yue Liu, Shengfang Zhai, Mingzhe Du, Yulin Chen, Tri Cao, Hongcheng Gao, Cheng Wang, Xinfeng Li, Kun Wang, Junfeng Fang, Jiaheng Zhang, Bryan Hooi
NeurIPS 2025
Paper / Code

Safety in Large Reasoning Models: A Survey
Cheng Wang*, Yue Liu, Baolong Bi, Duzhen Zhang, Zhongzhi Li, Junfeng Fang, Bryan Hooi
EMNLP 2025 Findings
Paper / GitHub

Tricking Retrievers with Influential Tokens: An Efficient Black-Box Corpus Poisoning Attack
Cheng Wang, Yiwei Wang, Yujun Cai, Bryan Hooi
NAACL 2025 Main
Paper

Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
Cheng Wang, Yiwei Wang, Bryan Hooi, Yujun Cai, Nanyun Peng, Kai-Wei Chang
COLING 2025
Paper / Code
🎓 Education
National University of Singapore (NUS)
Period: 2022 - Present
Major: Computer Science & Math
💼 Professional & Industry Experience
TikTok | Singapore
Algorithm Engineer Intern
Period: Jan 2025 - June 2025
National University of Singapore | Singapore
Teaching Assistant, Introduction to AI and Machine Learning
Period: Jan 2024 - May 2024

Last update: Sep 2025