Attention

• Sparse VideoGen2: Accelerate Video Generation with Sparse Attention via Semantic-Aware Permutation (arXiv:2505.18875)
• PAROAttention: Pattern-Aware ReOrdering for Efficient Sparse and Quantized Attention in Visual Generation Models (arXiv:2506.16054)
• SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration (arXiv:2410.02367)
• Radial Attention: O(n log n) Sparse Attention with Energy Decay for Long Video Generation (arXiv:2506.19852)
• ∇NABLA: Neighborhood Adaptive Block-Level Attention (arXiv:2507.13546)
• SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference (arXiv:2502.18137)
• SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention (arXiv:2509.24006)
• SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer (arXiv:2509.24695)
• Why Low-Precision Transformer Training Fails: An Analysis on Flash Attention (arXiv:2510.04212)
• Native Hybrid Attention for Efficient Sequence Modeling (arXiv:2510.07019)
• Sparser Block-Sparse Attention via Token Permutation (arXiv:2510.21270)
• LiteAttention: A Temporal Sparse Attention for Diffusion Transformers (arXiv:2511.11062)
• Fast Autoregressive Video Diffusion and World Models with Temporal Cache Compression and Sparse Attention (arXiv:2602.01801)
• SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning (arXiv:2602.13515)
• SLA2: Sparse-Linear Attention with Learnable Routing and QAT (arXiv:2602.12675)
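The common thread in this collection is skipping query–key blocks that contribute little to the output. As an illustration only (not the method of any specific paper above), here is a minimal NumPy sketch of block-sparse attention: a boolean block mask decides which key/value blocks each query block attends to, and the softmax is computed over the selected blocks only. All function and variable names here are hypothetical.

```python
import numpy as np

def block_sparse_attention(q, k, v, block_mask, block_size):
    """Block-sparse attention sketch (illustrative, not from any cited paper).

    q, k, v:      (seq_len, d) arrays, seq_len divisible by block_size.
    block_mask:   (n_blocks, n_blocks) boolean array; block_mask[i, j] is
                  True when query block i attends to key block j.
    """
    n, d = q.shape
    n_blocks = n // block_size
    out = np.zeros_like(q)
    for i in range(n_blocks):
        q_rows = slice(i * block_size, (i + 1) * block_size)
        # Gather only the key/value blocks this query block attends to.
        cols = [slice(j * block_size, (j + 1) * block_size)
                for j in range(n_blocks) if block_mask[i, j]]
        k_sel = np.concatenate([k[c] for c in cols], axis=0)
        v_sel = np.concatenate([v[c] for c in cols], axis=0)
        # Softmax over the selected keys only; skipped blocks cost nothing.
        scores = q[q_rows] @ k_sel.T / np.sqrt(d)
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        out[q_rows] = w @ v_sel
    return out
```

With an all-True mask this reduces exactly to dense softmax attention; the speedup in the papers above comes from choosing the mask well (semantic permutation, neighborhood adaptivity, top-k selection, etc.) so that most blocks can be skipped without hurting quality.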