evaluation-datasets
edinburgh-dawg/mmlu-redux-2.0 Viewer • Updated Feb 25, 2025 • 5.7k • 14.9k • 37
TIGER-Lab/MMLU-Pro Benchmark • Updated 11 days ago • 12.1k • 127k • 458
CohereLabs/Global-MMLU Viewer • Updated Aug 14, 2025 • 602k • 10.8k • 150
Idavidrein/gpqa Benchmark • Updated 16 days ago • 1.25k • 106k • 398
smoltalk
Contains the smoltalk dataset in multiple minority languages. The dataset is useful for post-training a base model.
rao254/smoltalk-kik Updated Nov 25, 2025 • 6
rao254/smoltalk-ja Viewer • Updated Nov 25, 2025 • 2.05k • 4
medical-datasets
ruslanmv/ai-medical-chatbot Viewer • Updated Mar 23, 2024 • 257k • 1.63k • 246
michsethowusu/Code-170k-luo Viewer • Updated Oct 30, 2025 • 169k • 41
edinburgh-dawg/mmlu-redux-2.0 Viewer • Updated Feb 25, 2025 • 5.7k • 14.9k • 37