QRQ's picture

QRQ

RichardQRQ

·

AI & ML interests

None yet

Recent Activity

upvoted a paper 4 days ago

Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

upvoted a paper 8 days ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

upvoted a paper 19 days ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

View all activity

Organizations

None yet

upvoted a paper 4 days ago

Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models

Paper • 2604.08545 • Published 6 days ago • 40

upvoted a paper 8 days ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published 9 days ago • 116

upvoted a paper 19 days ago

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published 20 days ago • 130

upvoted a paper 25 days ago

HopChain: Multi-Hop Data Synthesis for Generalizable Vision-Language Reasoning

Paper • 2603.17024 • Published 28 days ago • 109

upvoted 11 papers about 1 month ago

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Paper • 2603.12180 • Published Mar 12 • 65

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published Mar 12 • 91

OpenClaw-RL: Train Any Agent Simply by Talking

Paper • 2603.10165 • Published Mar 10 • 151

Thinking to Recall: How Reasoning Unlocks Parametric Knowledge in LLMs

Paper • 2603.09906 • Published Mar 10 • 75

Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing

Paper • 2603.03143 • Published Mar 3 • 145

InternVL-U: Democratizing Unified Multimodal Models for Understanding, Reasoning, Generation and Editing

Paper • 2603.09877 • Published Mar 10 • 48

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

Paper • 2603.09095 • Published Mar 10 • 29

Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion

Paper • 2603.06577 • Published Mar 6 • 49

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Paper • 2602.24286 • Published Feb 27 • 98

Enhancing Spatial Understanding in Image Generation via Reward Modeling

Paper • 2602.24233 • Published Feb 27 • 59

dLLM: Simple Diffusion Language Modeling

Paper • 2602.22661 • Published Feb 26 • 152

upvoted 3 papers about 2 months ago

OmniGAIA: Towards Native Omni-Modal AI Agents

Paper • 2602.22897 • Published Feb 26 • 53

On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published Feb 24 • 102

PyVision-RL: Forging Open Agentic Vision Models via RL

Paper • 2602.20739 • Published Feb 24 • 31

upvoted 2 papers 2 months ago

Towards Autonomous Mathematics Research

Paper • 2602.10177 • Published Feb 10 • 36

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Paper • 2602.10604 • Published Feb 11 • 194