Jcdbzh9olj

jcdbzh9olj

AI & ML interests

None yet

Recent Activity

liked a dataset 1 day ago

ryanmarten/OpenThoughts-1k-sample

liked a model 2 days ago

tencent/Hy-MT2-30B-A3B

upvoted a paper 3 days ago

OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under optimal Squared error quantization

Organizations

None yet

upvoted a paper 3 days ago

OCTOPUS: Optimized KV Cache for Transformers via Octahedral Parametrization Under optimal Squared error quantization

Paper • 2605.21226 • Published 4 days ago • 9

upvoted a paper 5 days ago

CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence

Paper • 2605.12882 • Published 11 days ago • 263

upvoted a paper 17 days ago

OpenSeeker-v2: Pushing the Limits of Search Agents with Informative and High-Difficulty Trajectories

Paper • 2605.04036 • Published 19 days ago • 66

upvoted a paper 23 days ago

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Paper • 2604.15574 • Published Apr 16 • 23

upvoted 2 papers about 1 month ago

Diverse Dictionary Learning

Paper • 2604.17568 • Published Apr 19 • 3

RewardFlow: Generate Images by Optimizing What You Reward

Paper • 2604.08536 • Published Apr 9 • 6

upvoted 6 papers about 2 months ago

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published Apr 6 • 203

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published Apr 2 • 101

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 629

CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence

Paper • 2603.28032 • Published Mar 30 • 342

ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers

Paper • 2603.24414 • Published Mar 25 • 183

FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization

Paper • 2603.19835 • Published Mar 20 • 351

upvoted 2 papers 2 months ago

SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Paper • 2602.12783 • Published Feb 13 • 246

Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Paper • 2603.04597 • Published Mar 4 • 210

upvoted 2 papers 3 months ago

A Very Big Video Reasoning Suite

Paper • 2602.20159 • Published Feb 23 • 523

Does Your Reasoning Model Implicitly Know When to Stop Thinking?

Paper • 2602.08354 • Published Feb 9 • 265