Finetuned language models are zero-shot learners

J Wei, M Bosma, VY Zhao, K Guu, AW Yu… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores a simple method for improving the zero-shot learning abilities of
language models. We show that instruction tuning--finetuning language models on a …
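
As a minimal sketch of what the instruction-tuning recipe looks like in code (the task, template wording, and label mapping below are illustrative assumptions, not the paper's actual templates): each supervised example is rewritten as a natural-language instruction with a short answer, the model is finetuned on a mixture of many such tasks, and it is then evaluated zero-shot on held-out tasks.

# Hypothetical instruction template for an NLI example; the paper's real templates differ.
def to_instruction_example(premise, hypothesis, label):
    """Rewrite a supervised NLI example as an instruction-following (prompt, target) pair."""
    prompt = (
        f"Premise: {premise}\n"
        f"Hypothesis: {hypothesis}\n"
        "Does the premise entail the hypothesis? Answer yes, no, or maybe."
    )
    target = {0: "yes", 1: "maybe", 2: "no"}[label]
    return prompt, target

# Finetuning then proceeds on many such (prompt, target) pairs drawn from a task mixture.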

Domain-specific language model pretraining for biomedical natural language processing

Y Gu, R Tinn, H Cheng, M Lucas, N Usuyama… - ACM Transactions on …, 2021 - dl.acm.org
Pretraining large neural language models, such as BERT, has led to impressive gains on
many natural language processing (NLP) tasks. However, most pretraining efforts focus on …

[BOOK] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …
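
As a rough illustration of the cross-encoder reranking setup covered in this line of work, a pretrained transformer jointly scores each query-passage pair and candidates are sorted by that score; the checkpoint name below is an assumption chosen for the sketch, not one taken from the book.

# Sketch of cross-encoder reranking with a pretrained transformer (model name is an assumption).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "cross-encoder/ms-marco-MiniLM-L-6-v2"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def rerank(query, candidates):
    """Score each (query, passage) pair jointly and sort candidates by relevance."""
    inputs = tokenizer([query] * len(candidates), candidates,
                       padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        scores = model(**inputs).logits.squeeze(-1)
    return sorted(zip(candidates, scores.tolist()), key=lambda x: x[1], reverse=True)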

Neural unsupervised domain adaptation in NLP---a survey

A Ramponi, B Plank - arXiv preprint arXiv:2006.00632, 2020 - arxiv.org
Deep neural networks excel at learning from labeled data and achieve state-of-the-art
results on a wide array of Natural Language Processing tasks. In contrast, learning from …

Achieving human parity on automatic Chinese to English news translation

H Hassan, A Aue, C Chen, V Chowdhary… - arXiv preprint arXiv …, 2018 - arxiv.org
Machine translation has made rapid advances in recent years. Millions of people are using it
today in online translation systems and mobile applications in order to communicate across …

Unsupervised domain clusters in pretrained language models

R Aharoni, Y Goldberg - arXiv preprint arXiv:2004.02105, 2020 - arxiv.org
The notion of "in-domain data" in NLP is often over-simplistic and vague, as textual data
varies in many nuanced linguistic aspects such as topic, style or level of formality. In …

Using the output embedding to improve language models

O Press, L Wolf - arXiv preprint arXiv:1608.05859, 2016 - arxiv.org
We study the topmost weight matrix of neural network language models. We show that this
matrix constitutes a valid word embedding. When training language models, we recommend …
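
The truncated recommendation is the paper's well-known weight-tying trick: reuse the input embedding matrix as the output projection. A minimal PyTorch sketch of that idea (the toy LSTM model here is illustrative, not the authors' setup):

import torch.nn as nn

class TiedLM(nn.Module):
    """Toy language model whose output projection shares weights with the input embedding."""
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.decoder = nn.Linear(hidden_size, vocab_size, bias=False)
        self.decoder.weight = self.embed.weight  # weight tying: one matrix serves as input and output embedding

    def forward(self, token_ids):
        hidden, _ = self.rnn(self.embed(token_ids))
        return self.decoder(hidden)  # logits over the vocabulary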

[PDF] Neural machine translation by jointly learning to align and translate

D Bahdanau, K Cho, Y Bengio - arXiv preprint arXiv:1409.0473, 2014 - peerj.com
Neural machine translation is a recently proposed approach to machine translation. Unlike
traditional statistical machine translation, neural machine translation aims at building …
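
The "jointly learning to align and translate" in the title is the attention mechanism; in the now-standard notation (the symbols follow common usage and are an assumption here, not a quotation from the paper), the decoder attends over encoder annotations as:

e_{ij} = a(s_{i-1}, h_j), \qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k=1}^{T_x} \exp(e_{ik})}, \qquad
c_i = \sum_{j=1}^{T_x} \alpha_{ij} h_j

where h_j is the encoder annotation of source position j, s_{i-1} is the previous decoder state, and the context vector c_i is fed to the decoder when predicting target word i.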

Learning phrase representations using RNN encoder-decoder for statistical machine translation

K Cho, B Van Merriënboer, C Gulcehre… - arXiv preprint arXiv …, 2014 - arxiv.org
In this paper, we propose a novel neural network model called RNN Encoder-Decoder that
consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols …
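
A minimal sketch of the encoder-decoder idea described here: one RNN compresses the source sequence into a fixed-length summary vector and a second RNN generates the target sequence conditioned on it (this toy GRU model is an illustration, not the paper's exact architecture):

import torch
import torch.nn as nn

class EncoderDecoder(nn.Module):
    """Toy RNN encoder-decoder: encode the source into a summary vector, decode the target from it."""
    def __init__(self, src_vocab, tgt_vocab, hidden_size):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, hidden_size)
        self.tgt_embed = nn.Embedding(tgt_vocab, hidden_size)
        self.encoder = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.decoder = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        _, summary = self.encoder(self.src_embed(src_ids))       # fixed-length source summary
        dec_states, _ = self.decoder(self.tgt_embed(tgt_ids), summary)
        return self.out(dec_states)                               # logits for each target position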

A survey of domain adaptation for machine translation

C Chu, R Wang - Journal of Information Processing, 2020 - jstage.jst.go.jp
Neural machine translation (NMT) is a deep learning-based approach for machine
translation, which outperforms traditional statistical machine translation (SMT) and yields the …