Publications

You can also find my articles on my Google Scholar profile.

Preprints


Hilbert Attention for Image Generation with Diffusion Models

Published in arXiv preprint, 2025 (under review at ICLR 2026)

HilbertA is a sparse attention mechanism for diffusion-based image generation that orders tokens along the Hilbert curve, jointly preserving 2D spatial locality and enabling contiguous memory access for better sparsity efficiency and memory throughput. The design combines Hilbert-curve reordering with tiling and sliding strategies that support both local modeling and global information flow while keeping GPU memory access coalesced and image locality intact. A custom fused sparse attention kernel written in Triton, together with LoRA fine-tuning, maximizes information flow and computational efficiency. On Flux.1, HilbertA achieves up to a 4.17× speedup at comparable image quality, a better speed–quality trade-off than dense and 2D sparse baselines.
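To make the reordering idea concrete, here is a minimal Python sketch, not the paper's implementation: `xy2d` is the standard Hilbert-curve index conversion, while the toy sliding-window attention, grid size, and window width are illustrative placeholders rather than HilbertA's actual Triton kernels or settings.

```python
import numpy as np

def xy2d(n, x, y):
    """Map 2D coordinates (x, y) to their index along a Hilbert curve
    covering an n x n grid (n a power of two). Standard algorithm."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:  # reflect, then swap, to keep the curve continuous
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_order(side):
    """Permutation taking row-major token order to Hilbert-curve order."""
    pairs = sorted((xy2d(side, x, y), y * side + x)
                   for y in range(side) for x in range(side))
    return np.array([flat for _, flat in pairs])

# Toy sliding-window attention over Hilbert-reordered tokens: because
# Hilbert neighbors are also 2D-spatial neighbors, a cheap 1D window over
# the reordered (now contiguous) sequence approximates local 2D attention.
side, dim, window = 8, 16, 4              # illustrative sizes only
tokens = np.random.randn(side * side, dim).astype(np.float32)
perm = hilbert_order(side)
x = tokens[perm]                          # contiguous Hilbert layout
out = np.empty_like(x)
for i in range(len(x)):
    lo, hi = max(0, i - window), min(len(x), i + window + 1)
    scores = x[lo:hi] @ x[i] / np.sqrt(dim)
    w = np.exp(scores - scores.max())
    out[i] = (w / w.sum()) @ x[lo:hi]
restored = out[np.argsort(perm)]          # back to row-major order
```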

Submodular Context Partitioning and Compression for In-Context Learning

Published in arXiv preprint, 2024 (under review at ACL 2026, Short Paper Track)

Sub-CP is a submodular, block-aware context selection framework that exposes a tunable diversity–coherence spectrum for scalable in-context learning. Four partition strategies (Global Diverse, Global-Local Diverse, Local Diverse, and Local Coherent) balance global coverage against local structure. Integrated into the DENSE, ICAE, and CEPE pipelines, Sub-CP yields consistent gains on TREC, SST-2/5, MR, and AG News.
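As a concrete illustration of the submodular selection at Sub-CP's core, below is a minimal sketch of greedy facility-location maximization over candidate in-context examples. The embeddings, similarity measure, and budget are placeholders, and the four partition strategies are not reproduced here.

```python
import numpy as np

def greedy_facility_location(sim, k):
    """Greedily pick k items maximizing F(S) = sum_i max_{j in S} sim[i, j],
    the monotone submodular facility-location objective: the chosen set
    should "cover" (be similar to) every candidate."""
    n = sim.shape[0]
    selected, cover = [], np.zeros(n)
    for _ in range(k):
        # Marginal gain of adding each candidate j given current coverage.
        gains = np.maximum(sim, cover[:, None]).sum(axis=0) - cover.sum()
        gains[selected] = -np.inf          # never re-pick an item
        j = int(np.argmax(gains))
        selected.append(j)
        cover = np.maximum(cover, sim[:, j])
    return selected

# Toy demo: choose 5 diverse in-context examples out of 20 candidates.
rng = np.random.default_rng(0)
emb = rng.normal(size=(20, 32))
emb /= np.linalg.norm(emb, axis=1, keepdims=True)
print(greedy_facility_location(emb @ emb.T, k=5))
```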

Conference Papers


ToMA: Token Merge with Attention for Diffusion Models

Published in ICML 2025, PMLR 267:40930–40951

ToMA is a GPU-aligned token merging framework for diffusion models. It reformulates token merging as an attention-like linear transformation with an invertible unmerge step, using submodular token selection and GPU-efficient operations to accelerate generation without degrading image quality.
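The merge-as-linear-map idea can be sketched in a few lines of numpy. This is a toy stand-in, not ToMA's method: it uses nearest-anchor assignment and a pseudo-inverse unmerge where ToMA uses submodular selection and fused GPU operations.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, dim = 64, 16, 8                    # toy sizes: merge N tokens into M
x = rng.normal(size=(N, dim))

# Hard-assign each token to the nearest of M "anchor" tokens. (ToMA picks
# tokens submodularly; random anchors keep the sketch short.)
anchors = x[rng.choice(N, size=M, replace=False)]
assign = np.argmin(((x[:, None] - anchors[None]) ** 2).sum(-1), axis=1)

# Merge as a linear map: row c of A averages the tokens assigned to c.
A = np.zeros((M, N))
A[assign, np.arange(N)] = 1.0
A /= np.maximum(A.sum(axis=1, keepdims=True), 1e-12)

merged = A @ x                           # (M, dim): the shortened sequence
processed = merged                       # stand-in for attention on merged tokens

# Unmerge via the pseudo-inverse of A, which for this averaging map simply
# scatters each processed merged token back to the positions it absorbed.
restored = np.linalg.pinv(A) @ processed  # (N, dim)
```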

Recommended citation: Lu, W.*, Zheng, S.*, Xia, Y., & Wang, S. (2025). "ToMA: Token Merge with Attention for Diffusion Models." ICML 2025, PMLR 267:40930–40951.