Zhiqi Chen

Undergraduate Student at Tsinghua University, focusing on reinforcement learning for reasoning-heavy large language models.

Research focus: Reinforcement Learning for Reasoning
Recent highlight: NeurIPS 2025 oral presentation showing the limits of RL training in expanding reasoning capacity.
Collaboration: LeapLab, working with Prof. Gao Huang and interdisciplinary teams on long-horizon reasoning.

Contact Scholar CV

Profile

About Me

I am a third-year B.Eng. student in Automation at Tsinghua University, advised by Prof. Gao Huang in LeapLab. My previous research probed how reinforcement learning with verifiable rewards (RLVR) shapes reasoning behaviour in large language models. I am currently continuing to explore reinforcement learning for reasoning.

Awards

2025
National Scholarship of China
2024
National Scholarship of China

Interests

Reinforcement Learning for Reasoning in LLMs and MLLMs
Embodied AI

Goal

Find the essence of reinforcement learning for reasoning and build a universal theory of reasoning.

Background

Education

Tsinghua University

B.Eng. in Automation · 2023 — present

GPA 3.99/4.00 (Rank 1/172).

Research Intern

LeapLab · 2025 — present

Core contributor to the Limit-of-RLVR project.

Selected Work

Publications

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Yang Yue*†, Zhiqi Chen*, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang
NeurIPS 2025 Best Paper Runner-Up Award · ICML 2025 Workshop AI4Math Best Paper Award
*Equal contribution · †Project lead

PDF arXiv Code