Shaoyi (Sean) Zheng — Ph.D. Student in Computer Science, NYU Courant
Courant Institute of Mathematical Sciences, New York University
Advisor: Prof. Shengjie Wang
New York University Shanghai & New York University. Minor in Mathematics.
Proposed a method that casts selective KV recomputation as an information flow problem, using attention-norm signals to identify tokens that are both semantically relevant and...
arXiv preprint
Proposed HilbertA, a sparse attention mechanism based on the Hilbert curve that jointly preserves 2D spatial locality and enables contiguous memory access, improving sparsity efficiency...
ICML 2025, PMLR 267:40930–40951
ToMA is a GPU-aligned token merging framework for diffusion models, reformulating token merging as an attention-like linear transformation with invertible unmerge to accelerate diffusion models...
arXiv preprint
Proposed Sub-CP, a submodular, block-aware context selection framework that controls a diversity–coherence spectrum for scalable in-context learning. Designed four partition strategies: Global Diverse, Global–Local Diverse, Local...
GPU-aligned token merging framework that reformulates merging as an attention-like linear transformation with invertible unmerge. Up to 1.4× speedup on SDXL without quality degradation.
Sparse attention via Hilbert curves preserving 2D spatial locality with custom Triton kernels. Achieves up to 4.17× speedup on Flux.1 with comparable quality.
Casts selective KV recomputation as an information flow problem, using attention-norm signals to identify critical tokens for efficient long-context retrieval-augmented generation.
Submodular, block-aware context selection framework controlling a diversity–coherence spectrum for scalable in-context learning with consistent benchmark improvements.
Built a 1M+ synthetic face dataset using SDXL + ControlNet + LoRA, accelerated generation by 40% via distributed pipelines, and contributed to fine-tuning a 1B-parameter multimodal anti-spoofing model on 8×H100 GPUs (97% accuracy).
Designed an 8M-sample dataset for a Haitong Securities chatbot using advanced text augmentation (DeBERTaV3, RoBERTa/Sentence-BERT), improving accuracy by 7%.