-
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 256 -
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published • 58 -
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper • 2501.12599 • Published • 126 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 124
Chengyue
cyli
AI & ML interests
None yet
Organizations
None yet
reasoning
-
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model
Paper • 2502.02737 • Published • 256 -
Demystifying Long Chain-of-Thought Reasoning in LLMs
Paper • 2502.03373 • Published • 58 -
Kimi k1.5: Scaling Reinforcement Learning with LLMs
Paper • 2501.12599 • Published • 126 -
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
Paper • 2501.17161 • Published • 124
models 0
None public yet
datasets 0
None public yet