Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Publications
Submodular Context Partitioning and Compression for In-Context Learning
Published in arXiv preprint, under review at ACL 2026 (Short Paper Track), 2024
Proposed Sub-CP, a submodular, block-aware context selection framework that controls a diversity–coherence spectrum for scalable in-context learning. Designed four partition strategies (Global Diverse, Global–Local Diverse, Local Diverse, and Local Coherent) to balance global coverage against local structure. Integrated Sub-CP into DENSE, ICAE, and CEPE pipelines, yielding significant gains on TREC, SST-2/5, MR, and AG News.
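To make the selection idea concrete, below is a minimal sketch of greedy facility-location selection applied independently per block, loosely in the spirit of a Local Diverse strategy. The function names, similarity construction, and per-block budget are illustrative assumptions for this sketch, not the paper's actual method or API.

```python
# Illustrative sketch only: greedy facility-location selection over blocks of
# context examples. Names and the similarity construction are hypothetical.
import numpy as np

def greedy_facility_location(sim: np.ndarray, k: int) -> list[int]:
    """Pick k items greedily maximizing sum_j max_{i in S} sim[i, j],
    a classic monotone submodular coverage objective."""
    n = sim.shape[0]
    selected: list[int] = []
    coverage = np.zeros(n)          # best similarity to the selected set so far
    for _ in range(k):
        # marginal gain of adding each candidate item
        gains = np.maximum(sim, coverage).sum(axis=1) - coverage.sum()
        gains[selected] = -np.inf   # do not re-select
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, sim[best])
    return selected

def select_per_block(sim: np.ndarray, blocks: list[list[int]], k_per_block: int):
    """'Local Diverse'-style selection: run the greedy selector independently
    inside each context block (the block partition is assumed to be given)."""
    chosen = []
    for block in blocks:
        sub = sim[np.ix_(block, block)]
        chosen.extend(block[i] for i in greedy_facility_location(sub, k_per_block))
    return chosen

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(12, 8))
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T               # cosine similarities of example embeddings
    blocks = [list(range(0, 6)), list(range(6, 12))]
    print(select_per_block(sim, blocks, k_per_block=2))
```

A Global Diverse-style variant would, roughly, run the same selector over the whole example pool rather than within each block; the actual strategies in the paper may differ in their objectives and partitioning details.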
ToMA: Token Merge with Attention for Diffusion Models
Published in ICML 2025, PMLR 267:40930–40951, 2025
ToMA is a GPU-aligned token merging framework for diffusion models. It reformulates token merging as an attention-like linear transformation with an invertible unmerge step, and combines submodular token selection with GPU-efficient operations to accelerate generation without degrading image quality.
Recommended citation: Lu, W.*, Zheng, S.*, Xia, Y., & Wang, S. (2025). "ToMA: Token Merge with Attention for Diffusion Models." ICML 2025, PMLR 267:40930–40951.
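As a rough illustration of the merge/unmerge idea, the sketch below expresses token merging as a linear operator built from a hard nearest-representative assignment, with an approximate unmerge that broadcasts merged values back to the original positions. The helper names and the toy choice of kept tokens are assumptions; this is not ToMA's GPU-aligned implementation or its submodular selection.

```python
# Illustrative sketch only: token merging written as a linear "merge" matrix
# with an approximately invertible "unmerge", in the spirit of an
# attention-like formulation. Names here are hypothetical.
import torch

def build_merge_matrices(x: torch.Tensor, num_keep: int):
    """x: (n, d) tokens. Keep `num_keep` representative tokens and assign
    every token to its most similar kept token. Returns (merge, unmerge)
    with merge: (num_keep, n) and unmerge: (n, num_keep) so that
    x_merged = merge @ x and x_restored ≈ unmerge @ x_merged."""
    n, _ = x.shape
    keep = torch.arange(num_keep)                 # toy choice of kept tokens
    sim = torch.nn.functional.normalize(x, dim=-1) @ \
          torch.nn.functional.normalize(x[keep], dim=-1).T   # (n, num_keep)
    assign = sim.argmax(dim=-1)                   # each token -> a kept token
    # Hard assignment matrix A: (n, num_keep), one-hot rows.
    A = torch.zeros(n, num_keep, dtype=x.dtype)
    A[torch.arange(n), assign] = 1.0
    counts = A.sum(dim=0).clamp(min=1.0)          # tokens per merged group
    merge = (A / counts).T                        # row-normalized: group means
    unmerge = A                                   # broadcast group value back
    return merge, unmerge

if __name__ == "__main__":
    x = torch.randn(16, 8)
    merge, unmerge = build_merge_matrices(x, num_keep=4)
    x_merged = merge @ x            # (4, 8): run attention on fewer tokens here
    x_restored = unmerge @ x_merged # (16, 8): approximate reconstruction
    print(x_merged.shape, x_restored.shape)
```

Because merge and unmerge are plain matrices, they can be fused with surrounding linear layers or attention projections, which is the kind of structure that makes the operation friendly to GPU execution.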
Hilbert Attention for Image Generation with Diffusion Models
Published in arXiv preprint, under review at ICLR 2026, 2025
Proposed HilbertA, a sparse attention mechanism based on the Hilbert curve that preserves 2D spatial locality while enabling contiguous memory access, improving both sparsity efficiency and memory throughput. Designed Hilbert-curve sparse attention with reordering, tiling, and sliding strategies that support local modeling and global information flow while keeping GPU memory accesses coalesced and preserving image locality. Developed fused sparse attention kernels in Triton and integrated LoRA fine-tuning to maximize information flow and computational efficiency. Achieved up to 4.17× speedup on Flux.1 with comparable image quality, demonstrating a superior speed–quality trade-off over dense and 2D sparse baselines.
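The sketch below illustrates only the underlying reordering idea: tokens of a square latent grid are permuted along a Hilbert curve so that a plain sliding-window mask over the 1D sequence mostly covers 2D-local neighbors while reads stay contiguous. All names and the toy grid size are assumptions; this does not reflect the paper's Triton kernels or its tiling and sliding designs.

```python
# Illustrative sketch only: Hilbert-curve reordering of image tokens plus a
# simple banded (sliding-window) attention mask. Names are hypothetical.
import numpy as np

def hilbert_d2xy(n: int, d: int) -> tuple[int, int]:
    """Map position d along a Hilbert curve to (x, y) on an n x n grid
    (n must be a power of two). Standard iterative conversion."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_permutation(n: int) -> np.ndarray:
    """Permutation taking raster-ordered tokens to Hilbert order."""
    perm = np.empty(n * n, dtype=np.int64)
    for d in range(n * n):
        x, y = hilbert_d2xy(n, d)
        perm[d] = y * n + x
    return perm

def banded_mask(length: int, window: int) -> np.ndarray:
    """Boolean sliding-window attention mask over the reordered sequence."""
    idx = np.arange(length)
    return np.abs(idx[:, None] - idx[None, :]) <= window

if __name__ == "__main__":
    n = 8                                # 8 x 8 latent grid (toy size)
    perm = hilbert_permutation(n)
    tokens = np.arange(n * n)            # stand-in for token features
    tokens_hilbert = tokens[perm]        # contiguous reads follow the curve
    mask = banded_mask(n * n, window=4)  # local sparse attention pattern
    print(tokens_hilbert[:16], int(mask.sum()))
```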
