Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training Paper • 2603.12255 • Published 4 days ago • 78
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 29 days ago • 52
fal/Qwen-Image-Edit-2511-Multiple-Angles-LoRA Image-to-Image • Updated Jan 7 • 53.4k • 1.13k
PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence Paper • 2512.16793 • Published Dec 18, 2025 • 75
LongVie 2: Multimodal Controllable Ultra-Long Video World Model Paper • 2512.13604 • Published Dec 15, 2025 • 74