CommonLID
Common Crawl's language identification benchmark, sampled from real-world web text and human-validated across hundreds of language varieties.
Reference • License: common-crawl-tou • Main score: macro_f1
commonlid (sorted by Macro F1)
| Model | Macro F1 | Micro F1 | Mean FPR (%) | Languages | Samples/s |
|---|---|---|---|---|---|
| OpenLID-v2 | 60.4 | 70.4 | 0.07 | 1456 | 60383.4 |
- Model: Identifier of the language identification model.
- Macro F1: Unweighted mean of per-language F1 (x100), averaged over languages with at least one gold sample in this dataset (paper / gold-only definition). Higher is better. This is the headline ranking column (see the sketch after this list).
- Micro F1: Sample-weighted F1 (x100), i.e. pooled correct / pooled predictions across all gold samples. Less affected by rare languages than macro F1. Higher is better.
- Mean FPR (%): Mean per-language false-positive rate (paper-style), i.e. how often the model labels a non-target sentence as the target language. Lower is better.
- Languages: Number of distinct languages the model emitted on this dataset (set(gold) | set(pred)). Reflects the model's output vocabulary on this test, not the gold language count.
- Samples/s: Throughput during evaluation (samples processed per second). Hardware-dependent; useful for relative comparison only.
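The aggregate columns can be reproduced from two parallel label lists. The sketch below is a hedged reading of the definitions above, not the benchmark's reference implementation; in particular it takes the per-language table's FP / (FP + TN_correct_other) denominator (defined further down) literally, counting as true negatives only the samples of other languages that were themselves classified correctly.

```python
from collections import Counter

def aggregate_metrics(gold, pred):
    """Aggregate row (scores x100) from parallel gold/predicted ISO 639-3 lists.

    Function name and structure are illustrative, not the benchmark's code.
    """
    assert len(gold) == len(pred)
    gold_counts = Counter(gold)                      # GT per language
    pred_counts = Counter(pred)                      # Predictions per language
    correct = Counter(g for g, p in zip(gold, pred) if g == p)

    f1s, fprs = [], []
    for lang in gold_counts:                         # gold-only / paper definition
        tp = correct[lang]
        precision = tp / pred_counts[lang] if pred_counts[lang] else 0.0
        recall = tp / gold_counts[lang]
        f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
        f1s.append(f1)
        fp = pred_counts[lang] - tp                  # other-language samples labelled `lang`
        # Assumed reading of the paper-style denominator: true negatives are
        # samples of *other* languages that were classified correctly.
        tn = sum(c for g, c in correct.items() if g != lang)
        fprs.append(fp / (fp + tn) if fp + tn else 0.0)

    macro_f1 = 100 * sum(f1s) / len(f1s)
    micro_f1 = 100 * sum(correct.values()) / len(gold)   # pooled correct / pooled total
    mean_fpr = 100 * sum(fprs) / len(fprs)
    return macro_f1, micro_f1, mean_fpr
```

Given the full gold/pred lists for a run, this yields the Macro F1, Micro F1 and Mean FPR (%) columns; Languages is simply len(set(gold) | set(pred)).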
Click a row to load per-language metrics.
| Language | F1 | Precision | Recall | FPR (%) | GT | Predictions | Correct |
|---|---|---|---|---|---|---|---|
- Language: ISO 639-3 code of the gold and/or predicted language.
- F1: Per-language F1 score (x100). Harmonic mean of precision and recall.
- Precision: Per-language precision (x100) = correct / predictions for this language. How often the model is right when it predicts this language.
- Recall: Per-language recall (x100) = correct / gold-count for this language. How much of this language's gold set the model recovers.
- FPR (%): Paper-style false-positive rate, FP / (FP + TN_correct_other), where TN_correct_other counts samples of other languages that were classified correctly. Measures how often samples in other languages are misclassified as this one (see the sketch after this list).
- GT: Ground-truth (gold) sample count for this language.
- Predictions: Number of times the model predicted this language.
- Correct: Predictions that match the gold label.
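For a single row of this table, the same counts reduce to a few lines. Again a sketch under the same assumed FPR reading; the dict keys mirror the column names, everything else (function name, return shape) is illustrative.

```python
def language_row(lang, gold, pred):
    """One per-language table row for `lang` (scores x100)."""
    gt = sum(g == lang for g in gold)                              # GT
    predictions = sum(p == lang for p in pred)                     # Predictions
    correct = sum(g == p == lang for g, p in zip(gold, pred))      # Correct
    precision = correct / predictions if predictions else 0.0
    recall = correct / gt if gt else 0.0
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    fp = predictions - correct                                     # false positives
    # Assumed denominator, as above: correctly classified other-language samples.
    tn = sum(g == p != lang for g, p in zip(gold, pred))
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"Language": lang, "F1": 100 * f1, "Precision": 100 * precision,
            "Recall": 100 * recall, "FPR (%)": 100 * fpr,
            "GT": gt, "Predictions": predictions, "Correct": correct}
```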
CommonLID (nano)
Common Crawl's language identification benchmark, sampled from real-world web text and human-validated across hundreds of language varieties. Nano slice: stratified sample (max 1000 + min 5 per language) of the parent benchmark, with the schema normalised to (index, text, language_iso639_3); see the sketch below.
Reference • License: common-crawl-tou • Main score: macro_f1
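A rough sketch of how such a nano slice could be drawn. The cap/floor reading of "max 1000 + min 5 per language" (at most 1000 rows per language, languages with fewer than 5 gold rows dropped), the seed, and the input row format are all assumptions; only the output schema (index, text, language_iso639_3) comes from the description above.

```python
import random
from collections import defaultdict

def nano_slice(rows, cap=1000, floor=5, seed=0):
    """Stratified sample normalised to (index, text, language_iso639_3)."""
    by_lang = defaultdict(list)
    for row in rows:                     # assumed input: dicts with text + language
        by_lang[row["language_iso639_3"]].append(row)
    rng = random.Random(seed)
    sampled = []
    for lang in sorted(by_lang):         # deterministic order across runs
        lang_rows = by_lang[lang]
        if len(lang_rows) < floor:       # assumed floor: drop tiny languages
            continue
        sampled.extend(rng.sample(lang_rows, min(cap, len(lang_rows))))
    # Re-index after sampling to produce the normalised schema
    return [{"index": i, "text": r["text"],
             "language_iso639_3": r["language_iso639_3"]}
            for i, r in enumerate(sampled)]
```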
commonlid_nano (sorted by Macro F1)
| Model | Macro F1 | Micro F1 | Mean FPR (%) | Languages | Samples/s |
|---|---|---|---|---|---|
| GPT-4o-mini | 72.5 | 77.3 | 0.24 | 130 | 358120.3 |
- Model: Identifier of the language identification model.
- Macro F1: Unweighted mean of per-language F1 (x100), averaged over languages with at least one gold sample in this dataset (paper / gold-only definition). Higher is better. This is the headline ranking column.
- Micro F1: Sample-weighted F1 (x100), i.e. pooled correct / pooled predictions across all gold samples. Less affected by rare languages than macro F1. Higher is better.
- Mean FPR (%): Mean per-language false-positive rate (paper-style), i.e. how often the model labels a non-target sentence as the target language. Lower is better.
- Languages: Number of distinct languages the model emitted on this dataset (set(gold) | set(pred)). Reflects the model's output vocabulary on this test, not the gold language count.
- Samples/s: Throughput during evaluation (samples processed per second). Hardware-dependent; useful for relative comparison only.
Click a row to load per-language metrics.
| Language | F1 | Precision | Recall | FPR (%) | GT | Predictions | Correct |
|---|---|---|---|---|---|---|---|
- Language: ISO 639-3 code of the gold and/or predicted language.
- F1: Per-language F1 score (x100). Harmonic mean of precision and recall.
- Precision: Per-language precision (x100) = correct / predictions for this language. How often the model is right when it predicts this language.
- Recall: Per-language recall (x100) = correct / gold-count for this language. How much of this language's gold set the model recovers.
- FPR (%): Paper-style false-positive rate, FP / (FP + TN_correct_other), where TN_correct_other counts samples of other languages that were classified correctly. Measures how often samples in other languages are misclassified as this one.
- GT: Ground-truth (gold) sample count for this language.
- Predictions: Number of times the model predicted this language.
- Correct: Predictions that match the gold label.
Source: commoncrawl/commonlid-results @ HEAD.