Modular deep learning
Transfer learning has recently become the dominant paradigm of machine learning. Pre-trained models fine-tuned for downstream tasks achieve better performance with fewer …
Language-specific neurons: The key to multilingual capabilities in large language models
Large language models (LLMs) demonstrate remarkable multilingual capabilities without
being pre-trained on specially curated multilingual parallel corpora. It remains a challenging …
Language embeddings sometimes contain typological generalizations
To what extent can neural network models learn generalizations about language structure,
and how do we find out what they have learned? We explore these questions by training …
The role of typological feature prediction in NLP and linguistics
J Bjerva - Computational Linguistics, 2024 - direct.mit.edu
Computational typology has gained traction in the field of Natural Language Processing
(NLP) in recent years, as evidenced by the increasing number of papers on the topic and the …
Data-driven cross-lingual syntax: An agreement study with massively multilingual models
Massively multilingual models such as mBERT and XLM-R are increasingly valued in
Natural Language Processing research and applications, due to their ability to tackle the …
Phylogeny-inspired adaptation of multilingual models to new languages
Large pretrained multilingual models, trained on dozens of languages, have delivered
promising results due to their cross-lingual learning capabilities on a variety of language tasks …
On the Multilingual Ability of Decoder-based Pre-trained Language Models: Finding and Controlling Language-Specific Neurons
Current decoder-based pre-trained language models (PLMs) successfully demonstrate
multilingual capabilities. However, it is unclear how these models handle multilingualism …
Multilingual Speech Models for Automatic Speech Recognition Exhibit Gender Performance Gaps
Current voice recognition approaches use multi-task, multilingual models for speech tasks
like Automatic Speech Recognition (ASR) to make them applicable to many languages …
Interpreting arithmetic mechanism in large language models through comparative neuron analysis
We find arithmetic ability resides within a limited number of attention heads, with each head
specializing in distinct operations. To delve into the reason, we introduce the Comparative …
Causal analysis of syntactic agreement neurons in multilingual language models
Structural probing work has found evidence for latent syntactic information in pre-trained
language models. However, much of this analysis has focused on monolingual models, and …