Zhenyu "Allen" Zhang

Education

Ph.D., The University of Texas at Austin
Sep. 2022 - May 2026
Ph.D. in Electrical and Computer Engineering
Advised by Prof. Atlas Wang.

M.E., University of Science and Technology of China
Sep. 2019 - May 2022
M.E. in Electrical and Computer Engineering
Advised by Prof. Bin Li.

B.S., University of Science and Technology of China
Sep. 2015 - May 2019
B.S. in Applied Physics
Yan Ji-Ci Talent Program in Physics

Selected Publications

SEAL: Steerable Reasoning Calibration of Large Language Models for Free

Runjin Chen*, Zhenyu Zhang*, Junyuan Hong, Souvik Kundu, Zhangyang Wang (* Equal Contribution)

Conference on Language Modeling (COLM), 2025

[Paper] [Project] [Code] [Intel]

APOLLO: SGD-like Memory, AdamW-level Performance

Hanqing Zhu*, Zhenyu Zhang*, Wenyan Cong, Xi Liu, Sem Park, Vikas Chandra, Bo Long, David Z. Pan, Zhangyang Wang, Jinwon Lee (* Equal Contribution)

Proceedings of Machine Learning and Systems (MLSys Outstanding Paper Award Honorable Mention), 2025

[Paper] [Project] [Code] [Hacker News] [LLaMA-Factory] [FluxML] [HuggingFace] [Axolotl] [Video] [机器之心]

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian

International Conference on Machine Learning (ICML Oral), 2024

[Paper] [Project] [Code] [Hacker News] [HuggingFace] [LLaMA-Factory] [Axolotl] [AICoffeeBreak] [机器之心]

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

Neural Information Processing Systems (NeurIPS), 2023

[Paper] [Project] [Code] [Meta/llama-recipes] [Answer.AI] [Talk] [新智元]

Work Experience

xAI - Member of Technical Staff (Mar. 2026 - Present)

Working on Grok reasoning.

Citadel Securities - Quantitative Research Intern (Sep. 2025 - Dec. 2025)

Conducted research on GenAI modeling for alpha prediction.
Worked with Dr. Wujie Huang.

Google DeepMind - Student Researcher (May 2025 - Aug. 2025)

Conducted research on mechanistic interpretability of reasoning models.
Worked with Dr. Shujian Zhang, Dr. John Lambert, and Dr. Lun Wang.

Together AI - Research Consultant (Feb. 2025 - May 2025)

Conducted research on efficient reasoning via activation engineering.
Worked with Dr. Shirley Wu, Dr. Qingyang Wu, and Dr. Ben Athiwaratkun.

Intel Labs - Research Scientist Intern (Sep. 2024 - Jan. 2025)

Conducted research on building long-context foundation models.
Worked with Dr. Souvik Kundu and Dr. Mostafa Hesham.

Meta Reality Labs - Research Scientist Intern (May 2024 - Aug. 2024)

Conducted research on efficient LLM inference via activation sparsity.
Worked with Dr. Steven Li, Dr. Zechun Liu, and Dr. Yuandong Tian.

Microsoft Research - Research Scientist Intern (Oct. 2023 - Feb. 2024)

Conducted research on enhancing the context awareness of long-context LLMs.
Worked with Dr. Zhewei Yao and Dr. Xiaoxia Wu.

Lawrence Livermore National Laboratory - Research Scientist Intern (May 2023 - Oct. 2023)

Conducted research on KV cache compression for efficient LLM inference.
Worked with Dr. Bhavya Kailkhura, Dr. Brian Bartoldson, and Dr. James Diffenderfer.

Hi there! I'm Zhenyu

My research focuses on developing personalized and self-improving models: (i) Model Reasoning (RL, Test-time Scaling), (ii) Long-Horizon Agentic Optimization, and (iii) Personalized and Efficient Training/Inference for GenAI.

I was recognized as a CPAL Rising Star in 2026 and an ML and Systems Rising Star in 2025, and received the MLSys'25 Outstanding Paper Award (Honorable Mention), along with several travel and reviewer awards from prestigious conferences.
