Curriculum Vitae

Shaoyi (Sean) Zheng — Ph.D. Student in Computer Science, NYU Courant

Education

Sep 2025 – Present

Ph.D. in Computer Science

Courant Institute of Mathematical Sciences, New York University
Advisor: Prof. Shengjie Wang

Aug 2021 – May 2025

B.S. in Computer Science

New York University Shanghai & New York University. Minor in Mathematics.

Research Interests

Efficient AI · Generative Models · Model Architecture Design · Algorithmic Acceleration

Selected Publications & Research Projects

ToMA: Token Merge with Attention

ICML 2025 · Co-First Author

GPU-aligned token merging framework that reformulates merging as an attention-like linear transformation with an invertible unmerge operation. Delivers up to 1.4× speedup on SDXL without quality degradation.

Diffusion Models · Token Merging · Submodular Optimization

Hilbert Attention for Diffusion Models

arXiv preprint · First Author

Sparse attention scheme that orders tokens along Hilbert curves to preserve 2D spatial locality, implemented with custom Triton kernels. Achieves up to 4.17× speedup on Flux.1 with comparable generation quality.

Sparse Attention · Hilbert Curve · Triton Kernels

InfoFlow KV: Information-Flow-Aware KV Recomputation

arXiv preprint · Co-First Author

Casts selective KV recomputation as an information flow problem, using attention-norm signals to identify critical tokens for efficient long-context retrieval-augmented generation.

Long-Context LLM · KV Cache · Information Flow

Sub-CP: Context Selection for ICL

arXiv preprint · First Author

Submodular, block-aware context selection framework that controls a diversity–coherence spectrum for scalable in-context learning, yielding consistent benchmark improvements.

In-Context Learning · Submodular Optimization · NLP

Industry Experience

May 2024 – Aug 2024

Tencent Technology — ML Engineer Intern

Built a dataset of 1M+ synthetic face images using SDXL with ControlNet and LoRA, accelerated generation by 40% via distributed pipelines, and contributed to fine-tuning a 1B-parameter multimodal anti-spoofing model on 8×H100 GPUs, reaching 97% accuracy.

May 2023 – Aug 2023

SenseTime Technology — ML Engineer Intern

Designed an 8M-sample training dataset for a Haitong Securities chatbot using model-based text augmentation (DeBERTaV3, RoBERTa/Sentence-BERT), improving response accuracy by 7%.

Skills

Programming

  • Python
  • PyTorch
  • CUDA & Triton
  • C++

Machine Learning

  • Diffusion Models
  • Transformer Architectures
  • LoRA Fine-tuning
  • Dataset Construction

Tools

  • Git & GitHub
  • LaTeX
  • Linux
  • Distributed Training