
Yichen You

youyc22


AI & ML interests

None yet

Recent Activity


authored a paper 7 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published 9 days ago • 30

upvoted a paper 7 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published 9 days ago • 30

upvoted a paper about 1 month ago

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14 • 107

updated a dataset about 1 month ago

youyc22/amteam-8b-121k-top16

Viewer • Updated Apr 12 • 83.9k • 14

published a dataset about 1 month ago

youyc22/amteam-8b-121k-top16

Viewer • Updated Apr 12 • 83.9k • 14

updated a collection about 1 month ago

TaH

Collection

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • 9 items • Updated Apr 12 • 2

published a dataset about 1 month ago

youyc22/amteam-121k-8k

Updated Apr 11 • 29

updated a dataset about 1 month ago

youyc22/amteam-121k-8k

Updated Apr 11 • 29

updated a collection about 2 months ago

TaH

Collection

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • 9 items • Updated Apr 12 • 2

updated a collection 2 months ago

TaH

Collection

Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models • 9 items • Updated Apr 12 • 2

upvoted a paper 3 months ago

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9 • 59

liked a dataset 4 months ago

Alibaba-Apsara/Superior-Reasoning-SFT-gpt-oss-120b

Viewer • Updated Jan 31 • 306k • 1.51k • 348

published a model 4 months ago

nics-efc/Standard-1.7B

Text Generation • 2B • Updated Jan 12 • 2

updated a model 4 months ago

nics-efc/Standard-1.7B

Text Generation • 2B • Updated Jan 12 • 2

liked 2 models 5 months ago

Nanbeige/Nanbeige4-3B-Thinking-2511

Text Generation • 4B • Updated Dec 17, 2025 • 1.04k • 206

openai/circuit-sparsity

Text Generation • 0.4B • Updated Dec 12, 2025 • 544 • 209

upvoted an article 6 months ago

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 396
