
Zhenyu "Allen" Zhang

Hi there! I'm a second-year Ph.D. student at UT Austin, advised by Prof. Zhangyang "Atlas" Wang. I also collaborate with Prof. Beidi Chen at CMU and Dr. Yuandong Tian at Meta. My research focuses on efficient and reliable machine learning systems, specifically the following topics:

  • Efficient training and inference for large foundation models
  • Scaling multi-modality foundation models with manageable costs
  • Unconventional computation paradigms, especially quantum computing



News
  • [May 2024] Five papers accepted at ICML'24: GaLore, Cache Merging, Adaptive LLM Pruning, Sparse Low-Rank KV Cache, and Once-for-All Sparse Training.

  • [Apr. 2024] Grateful to be awarded the MLSys'24 Student Travel Grant.

  • [Feb. 2024] One paper accepted at MLSys'24: Q-Hitter, a sparse-quantized KV cache.

  • [Jan. 2024] Two papers accepted at ICLR'24: JoMA, on the training dynamics of LLMs, and SMoE merging.

  • [Dec. 2023] One paper accepted at AAAI'24: sparsity-guided concept bottleneck models.

  • [Oct. 2023] Excited to start my internship at Microsoft Research.

  • [Sep. 2023] One paper accepted at NeurIPS'23: the LLM heavy-hitter oracle (H2O).

  • [Jul. 2023] One paper accepted at QCE'23: sparse exploration of quantum circuits.

Selected Publications (full list)

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Jiawei Zhao, Zhenyu Zhang, Beidi Chen, Zhangyang Wang, Anima Anandkumar, Yuandong Tian

ICML 2024  /  Paper  /  Code  /  Hacker News  /  HuggingFace  /  LLaMA-Factory  /  FedML  /  Axolotl  /  AICoffeeBreak

Oral Presentation

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

Zhenyu Zhang, Ying Sheng, Tianyi Zhou, Tianlong Chen, Lianmin Zheng, Ruisi Cai, Zhao Song, Yuandong Tian, Christopher Ré, Clark Barrett, Zhangyang Wang, Beidi Chen

NeurIPS 2023  /  Paper  /  Blog  /  Code  /  llama-recipes  /  Media (AI Era / 新智元)

Q-Hitter: A Better Token Oracle for Efficient LLM Inference via Sparse-Quantized KV Cache

Zhenyu Zhang*, Shiwei Liu*, Runjin Chen, Bhavya Kailkhura, Beidi Chen, Zhangyang Wang

MLSys 2024  /  Paper  /  Code

Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy

Pingzhi Li, Zhenyu Zhang, Prateek Yadav, Yi-Lin Sung, Yu Cheng, Mohit Bansal, Tianlong Chen

ICLR 2024  /  Paper  /  Code

Spotlight Presentation

JoMA: Demystifying Multilayer Transformers via JOint Dynamics of MLP and Attention

Yuandong Tian, Yiping Wang, Zhenyu Zhang, Beidi Chen, Simon Du

ICLR 2024  /  Paper

QuantumSEA: In-Time Sparse Exploration for Noise Adaptive Quantum Circuits

Tianlong Chen, Zhenyu Zhang, Hanrui Wang, Jiaqi Gu, Zirui Li, David Z. Pan, Frederic T. Chong, Song Han, Zhangyang Wang

QCE 2023  /  Paper  /  Code

Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers

Tianlong Chen*, Zhenyu Zhang*, Ajay Jaiswal, Shiwei Liu, Zhangyang Wang

ICLR 2023  /  Paper  /  Code

Spotlight Presentation

Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!

Shiwei Liu*, Tianlong Chen*, Zhenyu Zhang, Xuxi Chen, Tianjin Huang, Ajay Jaiswal, Zhangyang Wang

ICLR 2023  /  Paper  /  Code

Spotlight Presentation

Sparse Winning Tickets are Data-Efficient Image Recognizers

Mukund Varma T, Xuxi Chen, Zhenyu Zhang, Tianlong Chen, Subhashini Venugopalan, Zhangyang Wang

NeurIPS 2022  /  Paper  /  Code

Spotlight Presentation

Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets

Ruisi Cai*, Zhenyu Zhang*, Tianlong Chen, Xiaohan Chen, Zhangyang Wang

NeurIPS 2022  /  Paper  /  Code

Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free

Tianlong Chen*, Zhenyu Zhang*, Yihua Zhang*, Shiyu Chang, Sijia Liu, Zhangyang Wang

CVPR 2022  /  Paper  /  Code

Sparsity Winning Twice: Better Robust Generalization from More Efficient Training

Tianlong Chen*, Zhenyu Zhang*, Pengjun Wang*, Santosh Balachandra*, Haoyu Ma*, Zehao Wang, Zhangyang Wang

ICLR 2022  /  Paper  /  Code

Efficient Lottery Ticket Finding: Less Data is More

Zhenyu Zhang*, Xuxi Chen*, Tianlong Chen*, Zhangyang Wang

ICML 2021  /  Paper  /  Code

Robust Overfitting May be Mitigated by Properly Learned Smoothening

Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, Shiyu Chang, Zhangyang Wang

ICLR 2021  /  Paper  /  Code

Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning

Tianlong Chen*, Zhenyu Zhang*, Sijia Liu, Shiyu Chang, Zhangyang Wang

ICLR 2021  /  Paper  /  Code

GANs Can Play Lottery Tickets Too

Xuxi Chen*, Zhenyu Zhang*, Yongduo Sui, Tianlong Chen

ICLR 2021  /  Paper  /  Code

Work Experience

Microsoft Research

Research Intern, Sep. 2023 - Present

Advisors: Dr. Zhewei Yao, Dr. Xiaoxia Wu

Lawrence Livermore National Laboratory

Research Intern, May 2023 - Aug. 2023

Advisors: Dr. Bhavya Kailkhura, Dr. Brian Bartoldson, Dr. James Diffenderfer


Service
  • Invited Conference Reviewer: NeurIPS, ICLR, ICML, CVPR, ICCV, ECCV, ICIP, ICME, CPAL, ACCV
  • Invited Journal Reviewer: TNNLS, JMLR