Neural machine translation: A review
The field of machine translation (MT), the automatic translation of written text from one
natural language into another, has experienced a major paradigm shift in recent years …
Generative Artificial Intelligence for Software Engineering--A Research Agenda
Generative Artificial Intelligence (GenAI) tools have become increasingly prevalent in
software development, offering assistance to various managerial and technical project …
StarCoder: may the source be with you!
The BigCode community, an open-scientific collaboration working on the responsible
development of Large Language Models for Code (Code LLMs), introduces StarCoder and …
Unsupervised cross-lingual representation learning for speech recognition
This paper presents XLSR which learns cross-lingual speech representations by pretraining
a single model from the raw waveform of speech in multiple languages. We build on …
vq-wav2vec: Self-supervised learning of discrete speech representations
We propose vq-wav2vec to learn discrete representations of audio segments through a
wav2vec-style self-supervised context prediction task. The algorithm uses either a gumbel …
wav2vec: Unsupervised pre-training for speech recognition
We explore unsupervised pre-training for speech recognition by learning representations of
raw audio. wav2vec is trained on large amounts of unlabeled audio data and the resulting …
Large-scale evidence for logarithmic effects of word predictability on reading time
During real-time language comprehension, our minds rapidly decode complex meanings
from sequences of words. The difficulty of doing so is known to be related to words' …
BLiMP: The benchmark of linguistic minimal pairs for English
We introduce The Benchmark of Linguistic Minimal Pairs (BLiMP), a challenge set
for evaluating the linguistic knowledge of language models (LMs) on major grammatical …
Learning to ask: Neural question generation for reading comprehension
We study automatic question generation for sentences from text passages in reading
comprehension. We introduce an attention-based sequence learning model for the task and …
Deep Speech 2: End-to-end speech recognition in English and Mandarin
We show that an end-to-end deep learning approach can be used to recognize either
English or Mandarin Chinese speech–two vastly different languages. Because it replaces …