Building a Precise Video Language with Human-AI Oversight Paper • 2604.21718 • Published 7 days ago • 14
Agentic World Modeling: Foundations, Capabilities, Laws, and Beyond Paper • 2604.22748 • Published 5 days ago • 208
World-R1: Reinforcing 3D Constraints for Text-to-Video Generation Paper • 2604.24764 • Published 2 days ago • 108
Seeing Fast and Slow: Learning the Flow of Time in Videos Paper • 2604.21931 • Published 6 days ago • 19
WorldMark: A Unified Benchmark Suite for Interactive Video World Models Paper • 2604.21686 • Published 6 days ago • 36
Extending One-Step Image Generation from Class Labels to Text via Discriminative Text Representation Paper • 2604.18168 • Published 9 days ago • 97
SWE-chat: Coding Agent Interactions From Real Users in the Wild Paper • 2604.20779 • Published 7 days ago • 13
CoInteract: Physically-Consistent Human-Object Interaction Video Synthesis via Spatially-Structured Co-Generation Paper • 2604.19636 • Published 8 days ago • 86
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents Paper • 2604.18543 • Published 9 days ago • 27
MultiWorld: Scalable Multi-Agent Multi-View Video World Models Paper • 2604.18564 • Published 9 days ago • 43
Agent-World: Scaling Real-World Environment Synthesis for Evolving General Agent Intelligence Paper • 2604.18292 • Published 9 days ago • 81
HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds Paper • 2604.14268 • Published 14 days ago • 116
Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published 14 days ago • 153
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios Paper • 2604.07413 • Published 21 days ago • 95
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory Paper • 2604.08995 • Published 19 days ago • 48
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 20 days ago • 242