Papers · 5 days ago
COMPASS: Continual multilingual PEFT with adaptive semantic sampling outperforms baselines on Global-MMLU and MMLU-ProX

UC Berkeley introduces COMPASS, a data-centric framework for multilingual LLM adaptation that trains lightweight language-specific adapters via PEFT. Its distribution-aware sampling strategy uses multilingual embeddings and clustering to prioritize auxiliary data from under-represented semantic clusters, with the goal of maximizing positive cross-lingual transfer. Across Phi-4-Mini, Llama-3.1-8B, and Qwen2.5-7B on Global-MMLU, MMLU-ProX, and the long-context OneRuler benchmark, COMPASS consistently outperforms linguistic-similarity baselines. An extension, COMPASS-ECDA, targets continual learning under distribution shift.
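To make the sampling idea concrete, here is a minimal sketch of cluster-aware sampling. It is not the authors' implementation. It assumes the embedding and clustering steps have already run and produced an integer cluster label per example, and it treats a cluster as "under-represented" when the target-language data has few examples in it. That inverse-count rule, the `smoothing` parameter, and all function names are hypothetical choices made for illustration.

```python
import random
from collections import Counter


def cluster_priorities(target_labels, n_clusters, smoothing=1.0):
    """Give each cluster a priority that grows as target-language coverage shrinks.

    Assumption for illustration: priority is the inverse of the smoothed count
    of target-language examples in the cluster, normalized to sum to 1.
    """
    counts = Counter(target_labels)
    raw = [1.0 / (counts.get(c, 0) + smoothing) for c in range(n_clusters)]
    total = sum(raw)
    return [r / total for r in raw]


def auxiliary_sampling_probs(target_labels, aux_labels, n_clusters, smoothing=1.0):
    """Return one sampling probability per auxiliary example.

    Each auxiliary example inherits the priority of its cluster, so examples
    in clusters the target data covers poorly are drawn more often.
    """
    priorities = cluster_priorities(target_labels, n_clusters, smoothing)
    weights = [priorities[c] for c in aux_labels]
    total = sum(weights)
    return [w / total for w in weights]


def sample_auxiliary(probs, k, seed=0):
    """Draw k auxiliary example indices (with replacement) using the given probabilities."""
    rng = random.Random(seed)
    return rng.choices(range(len(probs)), weights=probs, k=k)


# Toy data: cluster 0 is well covered by the target language, cluster 1 is not.
target_labels = [0, 0, 0, 1]
aux_labels = [0, 1]
probs = auxiliary_sampling_probs(target_labels, aux_labels, n_clusters=2)
picked = sample_auxiliary(probs, k=5)
```

In this toy setup the auxiliary example from cluster 1 receives twice the sampling probability of the one from cluster 0, because the target data has one example in cluster 1 versus three in cluster 0. A real pipeline would derive the labels from multilingual sentence embeddings and a clustering step, and might weight clusters differently.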
- #multilingual
- #peft
- #continual-learning
- #uc-berkeley
University of California, Berkeley