GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents • Paper • 2604.07429 • Published Apr 8 • 121
ViVa: A Video-Generative Value Model for Robot Reinforcement Learning • Paper • 2604.08168 • Published Apr 9 • 18
Small Vision-Language Models are Smart Compressors for Long Video Understanding • Paper • 2604.08120 • Published Apr 9 • 21
MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping • Paper • 2604.08364 • Published Apr 9 • 101