Google's multilingual neural machine translation system: Enabling zero-shot translation

M Johnson, M Schuster, QV Le, M Krikun… - Transactions of the …, 2017 - direct.mit.edu
We propose a simple solution to use a single Neural Machine Translation (NMT) model to
translate between multiple languages. Our solution requires no changes to the model …

An empirical survey of data augmentation for limited data learning in NLP

J Chen, D Tam, C Raffel, M Bansal… - Transactions of the …, 2023 - direct.mit.edu
NLP has achieved great progress in the past decade through the use of neural models and
large labeled datasets. The dependence on abundant data prevents NLP models from being …

A survey of cross-lingual word embedding models

S Ruder, I Vulić, A Søgaard - Journal of Artificial Intelligence Research, 2019 - jair.org
Cross-lingual representations of words enable us to reason about word meaning in
multilingual contexts and are a key facilitator of cross-lingual transfer when developing …

Fully character-level neural machine translation without explicit segmentation

J Lee, K Cho, T Hofmann - Transactions of the Association for …, 2017 - direct.mit.edu
Most existing machine translation systems operate at the level of words, relying on explicit
segmentation to extract tokens. We introduce a neural machine translation (NMT) model that …

Evaluating GPT-4 and ChatGPT on Japanese medical licensing examinations

J Kasai, Y Kasai, K Sakaguchi, Y Yamada… - arXiv preprint arXiv …, 2023 - arxiv.org
As large language models (LLMs) gain popularity among speakers of diverse languages,
we believe that it is crucial to benchmark them to better understand model behaviors …

Choosing transfer languages for cross-lingual learning

YH Lin, CY Chen, J Lee, Z Li, Y Zhang, M Xia… - arXiv preprint arXiv …, 2019 - arxiv.org
Cross-lingual transfer, where a high-resource transfer language is used to improve the
accuracy of a low-resource task language, is now an invaluable tool for improving …

Modeling language variation and universals: A survey on typological linguistics for natural language processing

EM Ponti, H O'Horan, Y Berzak, I Vulić… - Computational …, 2019 - direct.mit.edu
Linguistic typology aims to capture structural and semantic variation across the world's
languages. A large-scale typology could provide excellent guidance for multilingual Natural …

Multi-simlex: A large-scale evaluation of multilingual and crosslingual lexical semantic similarity

I Vulić, S Baker, EM Ponti, U Petti, I Leviant… - Computational …, 2020 - direct.mit.edu
We introduce Multi-SimLex, a large-scale lexical resource and evaluation
benchmark covering data sets for 12 typologically diverse languages, including major …

Cross-lingual learning for text processing: A survey

M Pikuliak, M Šimko, M Bieliková - Expert Systems with Applications, 2021 - Elsevier
Many intelligent systems in business, government or academy process natural language as
an input during inference or they might even communicate with users in natural language …

Lost in translation: large language models in non-English content analysis

G Nicholas, A Bhatia - arXiv preprint arXiv:2306.07377, 2023 - arxiv.org
In recent years, large language models (e.g., OpenAI's GPT-4, Meta's LLaMa, Google's
PaLM) have become the dominant approach for building AI systems to analyze and …