Papers
arxiv:2503.01150

MiLiC-Eval: Benchmarking Multilingual LLMs for China's Minority Languages

Published on Mar 3, 2025

Abstract

MiLiC-Eval is a benchmark for minority languages in China that reveals the challenges large language models face on syntax-intensive tasks and diverse writing systems.

AI-generated summary

Large language models (LLMs) excel in high-resource languages but struggle with low-resource languages (LRLs), particularly those spoken by minority communities in China, such as Tibetan, Uyghur, Kazakh, and Mongolian. To systematically track the progress in these languages, we introduce MiLiC-Eval, a benchmark designed for minority languages in China, featuring 24K instances across 9 tasks. MiLiC-Eval focuses on underrepresented writing systems and provides a fine-grained assessment of linguistic and problem-solving skills. Our evaluation reveals that LLMs perform poorly on syntax-intensive tasks and multi-script languages. We further demonstrate how MiLiC-Eval can help advance LRL research in handling diverse writing systems and understanding the process of language adaptation.

Get this paper in your agent:

hf papers read 2503.01150

If you don't have the latest CLI, install it with:

curl -LsSf https://hf.co/cli/install.sh | bash
