Modular deep learning
Transfer learning has recently become the dominant paradigm of machine learning. Pre-trained models fine-tuned for downstream tasks achieve better performance with fewer …
AdapterSoup: Weight averaging to improve generalization of pretrained language models
Pretrained language models (PLMs) are trained on massive corpora, but often need to specialize to specific domains. A parameter-efficient adaptation method suggests training an …
On the domain adaptation and generalization of pretrained language models: A survey
Recent advances in NLP are brought by a range of large-scale pretrained language models (PLMs). These PLMs have brought significant performance gains for a range of NLP tasks …
mmT5: Modular multilingual pre-training solves source language hallucinations
Multilingual sequence-to-sequence models perform poorly with increased language coverage and fail to consistently generate text in the correct target language in few-shot …
M2QA: Multi-domain multilingual question answering
Generalization and robustness to input variation are core desiderata of machine learning research. Language varies along several axes, most importantly, language instance (e.g. …
m4Adapter: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter
Multilingual neural machine translation models (MNMT) yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time. However …
Multilingual pre-training with language and task adaptation for multilingual text style transfer
We exploit the pre-trained seq2seq model mBART for multilingual text style transfer. Using machine translated data as well as gold aligned English sentences yields state-of-the-art …
Attribute injection for pretrained language models: A new benchmark and an efficient method
Metadata attributes (e.g., user and product IDs from reviews) can be incorporated as additional inputs to neural-based NLP models, by expanding the architecture of the models …
Domain generalisation of NMT: Fusing adapters with leave-one-domain-out training
Generalising to unseen domains is under-explored and remains a challenge in neural machine translation. Inspired by recent research in parameter-efficient transfer learning from …
Towards Engineered Safe AI with Modular Concept Models
The inherent complexity and uncertainty of Machine Learning (ML) make it difficult for ML-based Computer Vision (CV) approaches to become prevalent in safety-critical domains like …