Modular deep learning

J Pfeiffer, S Ruder, I Vulić, EM Ponti - arXiv preprint arXiv:2302.11529, 2023 - arxiv.org
Transfer learning has recently become the dominant paradigm of machine learning. Pre-
trained models fine-tuned for downstream tasks achieve better performance with fewer …

AdapterSoup: Weight averaging to improve generalization of pretrained language models

A Chronopoulou, ME Peters, A Fraser… - arXiv preprint arXiv …, 2023 - arxiv.org
Pretrained language models (PLMs) are trained on massive corpora, but often need to
specialize to specific domains. A parameter-efficient adaptation method suggests training an …

On the domain adaptation and generalization of pretrained language models: A survey

X Guo, H Yu - arXiv preprint arXiv:2211.03154, 2022 - arxiv.org
Recent advances in NLP are brought by a range of large-scale pretrained language models
(PLMs). These PLMs have brought significant performance gains for a range of NLP tasks …

mmT5: Modular multilingual pre-training solves source language hallucinations

J Pfeiffer, F Piccinno, M Nicosia, X Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Multilingual sequence-to-sequence models perform poorly with increased language
coverage and fail to consistently generate text in the correct target language in few-shot …

M2QA: Multi-domain multilingual question answering

L Engländer, H Sterz, C Poth, J Pfeiffer… - arXiv preprint arXiv …, 2024 - arxiv.org
Generalization and robustness to input variation are core desiderata of machine learning
research. Language varies along several axes, most importantly, language instance (e.g. …

m4Adapter: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter

W Lai, A Chronopoulou, A Fraser - arXiv preprint arXiv:2210.11912, 2022 - arxiv.org
Multilingual neural machine translation models (MNMT) yield state-of-the-art performance
when evaluated on data from a domain and language pair seen at training time. However …

Multilingual pre-training with language and task adaptation for multilingual text style transfer

H Lai, A Toral, M Nissim - arXiv preprint arXiv:2203.08552, 2022 - arxiv.org
We exploit the pre-trained seq2seq model mBART for multilingual text style transfer. Using
machine translated data as well as gold aligned English sentences yields state-of-the-art …

Attribute injection for pretrained language models: A new benchmark and an efficient method

RK Amplayo, KM Yoo, SW Lee - Proceedings of the 29th …, 2022 - aclanthology.org
Metadata attributes (e.g., user and product IDs from reviews) can be incorporated as
additional inputs to neural-based NLP models, by expanding the architecture of the models …

Domain generalisation of NMT: Fusing adapters with leave-one-domain-out training

TT Vu, S Khadivi, D Phung… - Annual Meeting of the …, 2022 - research.monash.edu
Generalising to unseen domains is under-explored and remains a challenge in neural
machine translation. Inspired by recent research in parameter-efficient transfer learning from …

Towards Engineered Safe AI with Modular Concept Models

L Heidemann, I Kurzidem, M Monnet… - Proceedings of the …, 2024 - openaccess.thecvf.com
The inherent complexity and uncertainty of Machine Learning (ML) make it difficult for ML-
based Computer Vision (CV) approaches to become prevalent in safety-critical domains like …