arxiv:2604.18106

Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion

Published on Apr 20
Authors:

Abstract

AI-generated summary: TriMix is a test-time logit fusion framework that efficiently adapts large language models to low-resource languages by dynamically balancing competencies from specialized small models, high-resource instruction-tuned models, and large models, without requiring task-specific annotations.

Adapting large language models (LLMs) to low-resource languages (LRLs) is constrained by the scarcity of both task data and computational resources. Although Proxy Tuning offers a logit-level strategy for introducing scaling effects, it often fails in LRL settings because the large model's weak LRL competence can overwhelm the knowledge of specialized smaller models. We thus propose TriMix, a test-time logit fusion framework that dynamically balances capabilities from three sources: LRL competence from a continually pretrained small model, task competence from high-resource-language instruction tuning, and the scaling benefits of large models. It is data- and compute-efficient, requiring no LRL task annotations and only continual pretraining of a small model. Experiments across four model families and eight LRLs show that TriMix consistently outperforms single-model baselines and Proxy Tuning. Our analysis reveals that prioritizing the logits of the small LRL-specialized model is crucial for success, challenging the prevalent large-model-dominant assumption.
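
The abstract does not spell out TriMix's exact fusion rule, so the following is only a minimal sketch of test-time, logit-level fusion in the spirit it describes. Everything here is an assumption for illustration: the placeholder model ids, the fixed weights (the paper balances the three sources dynamically), the use of log-probabilities as the fused quantity, and the requirement that all three models share a tokenizer and vocabulary.

# Illustrative sketch of three-source, test-time logit fusion (not the paper's exact method).
# Assumptions: all three models share a tokenizer/vocabulary; weights are fixed here,
# whereas TriMix balances the sources dynamically; model ids are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

SMALL_LRL = "org/small-lrl-continued-pretrain"   # hypothetical: small model continually pretrained on the LRL
SMALL_HRL = "org/small-hrl-instruct"             # hypothetical: small model instruction-tuned on a high-resource language
LARGE     = "org/large-base"                     # hypothetical: large general-purpose model

tok = AutoTokenizer.from_pretrained(SMALL_LRL)
models = {
    name: AutoModelForCausalLM.from_pretrained(path).eval()
    for name, path in {"lrl": SMALL_LRL, "hrl": SMALL_HRL, "large": LARGE}.items()
}

# Assumed weights; the abstract only says the small LRL-specialized model's logits
# should be prioritized, so it gets the largest weight here.
WEIGHTS = {"lrl": 0.5, "hrl": 0.3, "large": 0.2}

@torch.no_grad()
def fused_greedy_decode(prompt: str, max_new_tokens: int = 64) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        # Next-token logits from each source at the current prefix.
        per_model = {name: m(ids).logits[:, -1, :] for name, m in models.items()}
        # Test-time fusion: weighted sum of log-probabilities across the three sources.
        fused = sum(WEIGHTS[name] * torch.log_softmax(logits, dim=-1)
                    for name, logits in per_model.items())
        next_id = fused.argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
        if tok.eos_token_id is not None and next_id.item() == tok.eos_token_id:
            break
    return tok.decode(ids[0], skip_special_tokens=True)

print(fused_greedy_decode("Translate into the target low-resource language: Hello, how are you?"))

For contrast, Proxy Tuning as usually described offsets the large model's logits by the difference between a tuned and an untuned small model; the abstract's point is that in LRL settings this lets the large model's weak LRL competence dominate, whereas a three-way balance that prioritizes the LRL-specialized small model works better.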
