neutrino12's Collections Vision
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation
Paper
• 2508.07981
• Published
• 63
CharacterShot: Controllable and Consistent 4D Character Animation
Paper
• 2508.07409
• Published
• 39
ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing
Paper
• 2508.10881
• Published
• 52
Puppeteer: Rig and Animate Your 3D Models
Paper
• 2508.10898
• Published
• 33
SeC: Advancing Complex Video Object Segmentation via Progressive Concept Construction
Paper
• 2507.15852
• Published
• 38
Yume: An Interactive World Generation Model
Paper
• 2507.17744
• Published
• 91
Ultra3D: Efficient and High-Fidelity 3D Generation with Part Attention
Paper
• 2507.17745
• Published
• 36
Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings
Paper
• 2508.00632
• Published
• 4
Matrix-3D: Omnidirectional Explorable 3D World Generation
Paper
• 2508.08086
• Published
• 76
DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning
Paper
• 2508.05405
• Published
• 64
Tinker: Diffusion's Gift to 3D--Multi-View Consistent Editing From Sparse Inputs without Per-Scene Optimization
Paper
• 2508.14811
• Published
• 42
Waver: Wave Your Way to Lifelike Video Generation
Paper
• 2508.15761
• Published
• 36
Matrix-Game 2.0: An Open-Source, Real-Time, and Streaming Interactive World Model
Paper
• 2508.13009
• Published
• 25
VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
Paper
• 2508.19247
• Published
• 43
ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks
Paper
• 2508.08240
• Published
• 45
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
Paper
• 2508.17437
• Published
• 37
MIDAS: Multimodal Interactive Digital-human Synthesis via Real-time Autoregressive Video Generation
Paper
• 2508.19320
• Published
• 29
Mixture of Contexts for Long Video Generation
Paper
• 2508.21058
• Published
• 35
T2I-ReasonBench: Benchmarking Reasoning-Informed Text-to-Image Generation
Paper
• 2508.17472
• Published
• 26
Do What? Teaching Vision-Language-Action Models to Reject the Impossible
Paper
• 2508.16292
• Published
• 9
ROSE: Remove Objects with Side Effects in Videos
Paper
• 2508.18633
• Published
• 7
Collaborative Multi-Modal Coding for High-Quality 3D Generation
Paper
• 2508.15228
• Published
• 4
MeshSplat: Generalizable Sparse-View Surface Reconstruction via Gaussian Splatting
Paper
• 2508.17811
• Published
• 7
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models
Paper
• 2509.17627
• Published
• 66