Pre-trained models for natural language processing: A survey
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …
A survey on deep learning for named entity recognition
Named entity recognition (NER) is the task of identifying mentions of rigid designators in text
belonging to predefined semantic types such as person, location, organization, etc. NER …
Data2vec: A general framework for self-supervised learning in speech, vision and language
While the general idea of self-supervised learning is identical across modalities, the actual
algorithms and objectives differ widely because they were developed with a single modality …
XLS-R: Self-supervised cross-lingual speech representation learning at scale
This paper presents XLS-R, a large-scale model for cross-lingual speech representation
learning based on wav2vec 2.0. We train models with up to 2B parameters on nearly half a …
LUKE: Deep contextualized entity representations with entity-aware self-attention
Entity representations are useful in natural language tasks involving entities. In this paper,
we propose new pretrained contextualized representations of words and entities based on …
Less training, more repairing please: revisiting automated program repair via zero-shot learning
Due to the promising future of Automated Program Repair (APR), researchers have
proposed various APR techniques, including heuristic-based, template-based, and …
Don't stop pretraining: Adapt language models to domains and tasks
Language models pretrained on text from a wide variety of sources form the foundation of
today's NLP. In light of the success of these broad-coverage models, we investigate whether …
Efficient self-supervised learning with contextualized target representations for vision, speech and language
Current self-supervised learning algorithms are often modality-specific and require large
amounts of computational resources. To address these issues, we increase the training …
A primer in BERTology: What we know about how BERT works
Transformer-based models have pushed the state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …
MiniLM: Deep self-attention distillation for task-agnostic compression of pre-trained transformers
Pre-trained language models (e.g., BERT (Devlin et al., 2018) and its variants) have achieved
remarkable success in a variety of NLP tasks. However, these models usually consist of …