- [2025.08] Two first-author papers accepted to EMNLP 2025! One to Main Track and one to Findings!
- [2025.06] Our paper GuardReasoner-VL is accepted to ICML 2025 R2-FM Workshop!
- [2025.04] Our survey on LRM safety is now on arXiv. Check out the paper and repo!
- [2025.01] One first-author paper is accepted to NAACL 2025 Main Conference.
- [2025.01] I started my internship at TikTok as an Algorithm Engineer Intern.
- [2024.11] One first-author paper is accepted to COLING 2025.
|
📑 Pre-prints & Publications
* denotes equal contribution.
|
|
Unlocking the Pre-Trained Model as a Dual-Alignment Calibrator for Post-Trained LLMs
Cheng Wang*, Beier Luo*, Hongxin Wei, Yixuan Li, Xuefeng Du
Under Review, 2025
|
|
Mirage or Method? How Model-Task Alignment Induces Divergent RL Conclusions
Haoze Wu*, Cheng Wang*, Wenshuo Zhao, Junxian He
Under Review, 2025
Paper / Code
|
|
NeurIPS MechInterp Workshop (Spotlight)
False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize
Cheng Wang*, Zeming Wei*, Qin Liu, Wenxuan Zhou, Muhao Chen
Paper / Code
|
|
Taming Extreme Tokens: Covariance-Aware GRPO with Gaussian-Kernel Advantage Reweighting
Cheng Wang, Qin Liu, Wenxuan Zhou, Muhao Chen
Under Review, 2025
|
|
EMNLP 2025 Main
When Audio and Text Disagree: Benchmarking Text Bias in Large Audio-Language Models under Cross-Modal Inconsistencies
Cheng Wang, Gelei Deng, Xianglin Yang, Tianwei Zhang
Paper / Code
|
|
NeurIPS 2025
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
Yue Liu, Shengfang Zhai, Mingzhe Du, Yulin Chen, Tri Cao, Hongcheng Gao, Cheng Wang, Xinfeng Li, Kun Wang, Junfeng Fang, Jiaheng Zhang, Bryan Hooi
Paper / Code
|
|
EMNLP 2025 Findings
Safety in Large Reasoning Models: A Survey
Cheng Wang*, Yue Liu, Baolong Bi, Duzhen Zhang, Zhongzhi Li, Junfeng Fang, Bryan Hooi
Paper / GitHub
|
|
NAACL 2025 Main
Tricking Retrievers with Influential Tokens: An Efficient Black-Box Corpus Poisoning Attack
Cheng Wang, Yiwei Wang, Yujun Cai, Bryan Hooi
Paper
|
|
COLING 2025
Con-ReCall: Detecting Pre-training Data in LLMs via Contrastive Decoding
Cheng Wang, Yiwei Wang, Bryan Hooi, Yujun Cai, Nanyun Peng, Kai-Wei Chang
Paper / Code
|
|
National University of Singapore (NUS)
Period: 2022 - Present
Major: Computer Science & Math
|
💼 Professional & Industry Experience
|
|
TikTok | Singapore
Algorithm Engineer Intern
Period: Jan 2025 - Jun 2025
|
|
National University of Singapore | Singapore
Teaching Assistant, Introduction to AI and Machine Learning
Period: Jan 2024 - May 2024
|
|