Do llamas work in English? On the latent language of multilingual transformers

C Wendler, V Veselovsky, G Monea… - Proceedings of the 62nd …, 2024 - aclanthology.org
We ask whether multilingual language models trained on unbalanced, English-dominated
corpora use English as an internal pivot language, a question of key importance for …

On the multilingual ability of decoder-based pre-trained language models: Finding and controlling language-specific neurons

T Kojima, I Okimura, Y Iwasawa, H Yanaka… - arXiv preprint arXiv …, 2024 - arxiv.org
Current decoder-based pre-trained language models (PLMs) successfully demonstrate
multilingual capabilities. However, it is unclear how these models handle multilingualism …

Extracting sentence embeddings from pretrained transformer models

L Stankevičius, M Lukoševičius - Applied Sciences, 2024 - mdpi.com
Pre-trained transformer models shine in many natural language processing tasks and
therefore are expected to bear the representation of the input sentence or text meaning …

How do languages influence each other? Studying cross-lingual data sharing during LM fine-tuning

R Choenni, D Garrette, E Shutova - arXiv preprint arXiv:2305.13286, 2023 - arxiv.org
Multilingual large language models (MLLMs) are jointly trained on data from many different
languages such that representation of individual languages can benefit from other …

Data-driven cross-lingual syntax: An agreement study with massively multilingual models

AG Varda, M Marelli - Computational Linguistics, 2023 - direct.mit.edu
Massively multilingual models such as mBERT and XLM-R are increasingly valued in
Natural Language Processing research and applications, due to their ability to tackle the …

Analyzing the mono- and cross-lingual pretraining dynamics of multilingual language models

T Blevins, H Gonen, L Zettlemoyer - arXiv preprint arXiv:2205.11758, 2022 - arxiv.org
The emergent cross-lingual transfer seen in multilingual pretrained models has sparked
significant interest in studying their behavior. However, because these analyses have …

Discovering language-neutral sub-networks in multilingual language models

N Foroutan, M Banaei, R Lebret, A Bosselut… - arXiv preprint arXiv …, 2022 - arxiv.org
Multilingual pre-trained language models transfer remarkably well on cross-lingual
downstream tasks. However, the extent to which they learn language-neutral …

Discovering low-rank subspaces for language-agnostic multilingual representations

Z **e, H Zhao, T Yu, S Li - arxiv preprint arxiv:2401.05792, 2024 - arxiv.org
Large pretrained multilingual language models (ML-LMs) have shown remarkable
capabilities of zero-shot cross-lingual transfer, without direct cross-lingual supervision. While …

Differential privacy, linguistic fairness, and training data influence: Impossibility and possibility theorems for multilingual language models

P Rust, A Søgaard - International Conference on Machine …, 2023 - proceedings.mlr.press
Language models such as mBERT, XLM-R, and BLOOM aim to achieve
multilingual generalization or compression to facilitate transfer to a large number of …

BioBERTurk: exploring Turkish biomedical language model development strategies in low-resource setting

H Türkmen, O Dikenelli, C Eraslan, MC Callı… - Journal of Healthcare …, 2023 - Springer
Pretrained language models augmented with in-domain corpora show impressive results in
biomedicine and clinical Natural Language Processing (NLP) tasks in English. However …