reasoning
Training Large Language Models to Reason in a Continuous Latent Space
Paper
• 2412.06769
• Published
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Paper
• 2408.03314
• Published
ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights
Paper
• 2406.14596
• Published
A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More
Paper
• 2407.16216
• Published
Thinking LLMs: General Instruction Following with Thought Generation
Paper
• 2410.10630
• Published
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
Paper
• 2501.04519
• Published
TextGrad: Automatic "Differentiation" via Text
Paper
• 2406.07496
• Published
Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving
Paper
• 2002.03629
• Published
LIMO: Less is More for Reasoning
Paper
• 2502.03387
• Published