Neural unsupervised domain adaptation in NLP—a survey

A Ramponi, B Plank - arXiv preprint arXiv:2006.00632, 2020 - arxiv.org
Deep neural networks excel at learning from labeled data and achieve state-of-the-art
results on a wide array of Natural Language Processing tasks. In contrast, learning from …

XTREME-R: Towards more challenging and nuanced multilingual evaluation

S Ruder, N Constant, J Botha, A Siddhant… - arXiv preprint arXiv …, 2021 - arxiv.org
Machine learning has brought striking advances in multilingual natural language processing
capabilities over the past year. For example, the latest techniques have improved the state …

The AI doctor is in: A survey of task-oriented dialogue systems for healthcare applications

M Valizadeh, N Parde - Proceedings of the 60th Annual Meeting …, 2022 - aclanthology.org
Task-oriented dialogue systems are increasingly prevalent in healthcare settings, and have
been characterized by a diverse range of architectures and objectives. Although these …

Language ID in the wild: Unexpected challenges on the path to a thousand-language web text corpus

I Caswell, T Breiner, D Van Esch, A Bapna - arXiv preprint arXiv …, 2020 - arxiv.org
Large text corpora are increasingly important for a wide variety of Natural Language
Processing (NLP) tasks, and automatic language identification (LangID) is a core technology …

How linguistically fair are multilingual pre-trained language models?

M Choudhury, A Deshpande - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Massively multilingual pre-trained language models, such as mBERT and XLM-RoBERTa,
have received significant attention in the recent NLP literature for their excellent capability …

A closer look at few-shot crosslingual transfer: The choice of shots matters

M Zhao, Y Zhu, E Shareghi, I Vulić, R Reichart… - arXiv preprint arXiv …, 2020 - arxiv.org
Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with
pretrained encoders like multilingual BERT. Despite its growing popularity, little to no …

Improving word translation via two-stage contrastive learning

Y Li, F Liu, N Collier, A Korhonen, I Vulić - arXiv preprint arXiv:2203.08307, 2022 - arxiv.org
Word translation or bilingual lexicon induction (BLI) is a key cross-lingual task, aiming to
bridge the lexical gap between different languages. In this work, we propose a robust and …

Learning causal representations for robust domain adaptation

S Yang, K Yu, F Cao, L Liu, H Wang… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
In this study, we investigate a challenging problem, namely, robust domain adaptation,
where data from only a single well-labeled source domain are available in the training …

English contrastive learning can learn universal cross-lingual sentence embeddings

YS Wang, A Wu, G Neubig - arXiv preprint arXiv:2211.06127, 2022 - arxiv.org
Universal cross-lingual sentence embeddings map semantically similar cross-lingual
sentences into a shared embedding space. Aligning cross-lingual sentence embeddings …

XL-WiC: A multilingual benchmark for evaluating semantic contextualization

A Raganato, T Pasini… - Proceedings of the …, 2020 - aclanthology.org
The ability to correctly model distinct meanings of a word is crucial for the effectiveness of
semantic representation techniques. However, most existing evaluation benchmarks for …