Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Publications
Submodular Context Partitioning and Compression for In-Context Learning
Published in arXiv preprint, under review at ACL 2026 (Short Paper Track), 2024
Proposed Sub-CP, a submodular, block-aware context selection framework that controls a diversity–coherence spectrum for scalable in-context learning. Designed four partition strategies (Global Diverse, Global–Local Diverse, Local Diverse, and Local Coherent) to balance global coverage against local structure. Integrated Sub-CP into DENSE, ICAE, and CEPE pipelines, yielding significant gains on TREC, SST-2/5, MR, and AG News.
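To make the selection idea concrete, below is a minimal sketch of greedy facility-location selection applied independently per block, loosely in the spirit of a Local Diverse strategy. The function names, similarity construction, and per-block budget are illustrative assumptions for this sketch, not the paper's actual method or API.

```python
# Illustrative sketch only: greedy facility-location selection over blocks of
# context examples. Names and the similarity construction are hypothetical.
import numpy as np

def greedy_facility_location(sim: np.ndarray, k: int) -> list[int]:
    """Pick k items greedily maximizing sum_j max_{i in S} sim[i, j],
    a classic monotone submodular coverage objective."""
    n = sim.shape[0]
    selected: list[int] = []
    coverage = np.zeros(n)          # best similarity to the selected set so far
    for _ in range(k):
        # marginal gain of adding each candidate item
        gains = np.maximum(sim, coverage).sum(axis=1) - coverage.sum()
        gains[selected] = -np.inf   # do not re-select
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, sim[best])
    return selected

def select_per_block(sim: np.ndarray, blocks: list[list[int]], k_per_block: int):
    """'Local Diverse'-style selection: run the greedy selector independently
    inside each context block (the block partition is assumed to be given)."""
    chosen = []
    for block in blocks:
        sub = sim[np.ix_(block, block)]
        chosen.extend(block[i] for i in greedy_facility_location(sub, k_per_block))
    return chosen

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    emb = rng.normal(size=(12, 8))
    emb /= np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T               # cosine similarities of example embeddings
    blocks = [list(range(0, 6)), list(range(6, 12))]
    print(select_per_block(sim, blocks, k_per_block=2))
```

A Global Diverse-style variant would, roughly, run the same selector over the whole example pool rather than within each block; the actual strategies in the paper may differ in their objectives and partitioning details.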
ToMA: Token Merge with Attention for Diffusion Models
Published in ICML 2025, PMLR 267:40930–40951, 2025
ToMA is a GPU-aligned token merging framework for diffusion models. It reformulates token merging as an attention-like linear transformation with an invertible unmerge step, and combines submodular token selection with GPU-efficient operations to accelerate generation without degrading image quality.
Recommended citation: Lu, W.*, Zheng, S.*, Xia, Y., & Wang, S. (2025). "ToMA: Token Merge with Attention for Diffusion Models." ICML 2025, PMLR 267:40930–40951.
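As a rough illustration of the merge/unmerge idea, the sketch below expresses token merging as a linear operator built from a hard nearest-representative assignment, with an approximate unmerge that broadcasts merged values back to the original positions. The helper names and the toy choice of kept tokens are assumptions; this is not ToMA's GPU-aligned implementation or its submodular selection.

```python
# Illustrative sketch only: token merging written as a linear "merge" matrix
# with an approximately invertible "unmerge", in the spirit of an
# attention-like formulation. Names here are hypothetical.
import torch

def build_merge_matrices(x: torch.Tensor, num_keep: int):
    """x: (n, d) tokens. Keep `num_keep` representative tokens and assign
    every token to its most similar kept token. Returns (merge, unmerge)
    with merge: (num_keep, n) and unmerge: (n, num_keep) so that
    x_merged = merge @ x and x_restored ≈ unmerge @ x_merged."""
    n, _ = x.shape
    keep = torch.arange(num_keep)                 # toy choice of kept tokens
    sim = torch.nn.functional.normalize(x, dim=-1) @ \
          torch.nn.functional.normalize(x[keep], dim=-1).T   # (n, num_keep)
    assign = sim.argmax(dim=-1)                   # each token -> a kept token
    # Hard assignment matrix A: (n, num_keep), one-hot rows.
    A = torch.zeros(n, num_keep, dtype=x.dtype)
    A[torch.arange(n), assign] = 1.0
    counts = A.sum(dim=0).clamp(min=1.0)          # tokens per merged group
    merge = (A / counts).T                        # row-normalized: group means
    unmerge = A                                   # broadcast group value back
    return merge, unmerge

if __name__ == "__main__":
    x = torch.randn(16, 8)
    merge, unmerge = build_merge_matrices(x, num_keep=4)
    x_merged = merge @ x            # (4, 8): run attention on fewer tokens here
    x_restored = unmerge @ x_merged # (16, 8): approximate reconstruction
    print(x_merged.shape, x_restored.shape)
```

Because merge and unmerge are plain matrices, they can be fused with surrounding linear layers or attention projections, which is the kind of structure that makes the operation friendly to GPU execution.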
Hilbert Attention for Image Generation with Diffusion Models
Published in arXiv preprint, under review at ICLR 2026, 2025
Proposed HilbertA, a sparse attention mechanism based on the Hilbert curve that preserves 2D spatial locality while enabling contiguous memory access, improving both sparsity efficiency and memory throughput. Designed Hilbert-curve sparse attention with reordering, tiling, and sliding strategies that support local modeling and global information flow while keeping GPU memory accesses coalesced and preserving image locality. Developed fused sparse attention kernels in Triton and integrated LoRA fine-tuning to maximize information flow and computational efficiency. Achieved up to 4.17× speedup on Flux.1 with comparable image quality, demonstrating a superior speed–quality trade-off over dense and 2D sparse baselines.
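The sketch below illustrates only the underlying reordering idea: tokens of a square latent grid are permuted along a Hilbert curve so that a plain sliding-window mask over the 1D sequence mostly covers 2D-local neighbors while reads stay contiguous. All names and the toy grid size are assumptions; this does not reflect the paper's Triton kernels or its tiling and sliding designs.

```python
# Illustrative sketch only: Hilbert-curve reordering of image tokens plus a
# simple banded (sliding-window) attention mask. Names are hypothetical.
import numpy as np

def hilbert_d2xy(n: int, d: int) -> tuple[int, int]:
    """Map position d along a Hilbert curve to (x, y) on an n x n grid
    (n must be a power of two). Standard iterative conversion."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                      # rotate the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_permutation(n: int) -> np.ndarray:
    """Permutation taking raster-ordered tokens to Hilbert order."""
    perm = np.empty(n * n, dtype=np.int64)
    for d in range(n * n):
        x, y = hilbert_d2xy(n, d)
        perm[d] = y * n + x
    return perm

def banded_mask(length: int, window: int) -> np.ndarray:
    """Boolean sliding-window attention mask over the reordered sequence."""
    idx = np.arange(length)
    return np.abs(idx[:, None] - idx[None, :]) <= window

if __name__ == "__main__":
    n = 8                                # 8 x 8 latent grid (toy size)
    perm = hilbert_permutation(n)
    tokens = np.arange(n * n)            # stand-in for token features
    tokens_hilbert = tokens[perm]        # contiguous reads follow the curve
    mask = banded_mask(n * n, window=4)  # local sparse attention pattern
    print(tokens_hilbert[:16], int(mask.sum()))
```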
