Gradient vaccine: Investigating and improving multi-task optimization in massively multilingual models
Massively multilingual models subsuming tens or even hundreds of languages pose great
challenges to multi-task optimization. While it is a common practice to apply a language …
On negative interference in multilingual models: Findings and a meta-learning treatment
Modern multilingual models are trained on concatenated text from multiple languages in
hopes of conferring benefits to each (positive transfer), with the most pronounced benefits …
MulDA: A multilingual data augmentation framework for low-resource cross-lingual NER
Named Entity Recognition (NER) for low-resource languages is both a practical and
challenging research problem. This paper addresses zero-shot transfer for cross-lingual …
A primer on pretrained multilingual language models
Multilingual Language Models (MLLMs) such as mBERT, XLM, XLM-R, etc. have
emerged as a viable option for bringing the power of pretraining to a large number of …
Explicit alignment objectives for multilingual bidirectional encoders
Pre-trained cross-lingual encoders such as mBERT (Devlin et al., 2019) and XLMR
(Conneau et al., 2020) have proven to be impressively effective at enabling transfer-learning …
Cross-modal generalization: Learning in low resource modalities via meta-alignment
How can we generalize to a new prediction task at test time when it also uses a new
modality as input? More importantly, how can we do this with as little annotated data as …
Cross-lingual alignment methods for multilingual BERT: A comparative study
Multilingual BERT (mBERT) has shown reasonable capability for zero-shot cross-lingual
transfer when fine-tuned on downstream tasks. Since mBERT is not pre-trained with explicit …
Investigating Unsupervised Neural Machine Translation for Low-resource Language Pair English-Mizo via Lexically Enhanced Pre-trained Language Models
The vast majority of languages in the world at present are considered to be low-resource
languages. Since the availability of large parallel data is crucial for the success of most …
Breaking the script barrier in multilingual pre-trained language models with transliteration-based post-training alignment
Multilingual pre-trained models (mPLMs) have shown impressive performance on cross-
lingual transfer tasks. However, the transfer performance is often hindered when a low …
Model and data transfer for cross-lingual sequence labelling in zero-resource settings
Zero-resource cross-lingual transfer approaches aim to apply supervised models from a
source language to unlabelled target languages. In this paper we perform an in-depth study …