RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published 5 days ago • 99
WildDet3D: Scaling Promptable 3D Detection in the Wild Paper • 2604.08626 • Published 9 days ago • 237
OptiMer: Optimal Distribution Vector Merging Is Better than Data Mixing for Continual Pre-Training Paper • 2603.28858 • Published 18 days ago • 9
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published 29 days ago • 337
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Paper • 2603.25716 • Published 22 days ago • 154
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training Paper • 2602.10693 • Published Feb 11 • 220
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published Feb 13 • 216
NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control Paper • 2602.09070 • Published Feb 9 • 46