Pre-trained models for natural language processing: A survey
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …
A survey on deep learning for named entity recognition
Named entity recognition (NER) is the task of identifying mentions of rigid designators in text
belonging to predefined semantic types such as person, location, organization, etc. NER …
An empirical evaluation of generic convolutional and recurrent networks for sequence modeling
For most deep learning practitioners, sequence modeling is synonymous with recurrent
networks. Yet recent results indicate that convolutional architectures can outperform …
On the variance of the adaptive learning rate and beyond
The learning rate warmup heuristic achieves remarkable success in stabilizing training,
accelerating convergence and improving generalization for adaptive stochastic optimization …
Reducing transformer depth on demand with structured dropout
Overparameterized transformer networks have obtained state of the art results in various
natural language processing tasks, such as machine translation, language modeling, and …
MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance
A robust evaluation metric has a profound impact on the development of text generation
systems. A desirable metric compares system output against references based on their …
Dissecting contextual word embeddings: Architecture and representation
Contextual word representations derived from pre-trained bidirectional language models
(biLMs) have recently been shown to provide significant improvements to the state of the art …
Structured pruning of large language models
Large language models have recently achieved state of the art performance across a wide
variety of natural language tasks. Meanwhile, the size of these models and their latency …
Named entity extraction for knowledge graphs: A literature overview
An enormous amount of digital information is expressed as natural-language (NL) text that is
not easily processable by computers. Knowledge Graphs (KG) offer a widely used format for …
Framework for deep learning-based language models using multi-task learning in natural language understanding: A systematic literature review and future directions
Learning human languages is a difficult task for a computer. However, Deep Learning (DL)
techniques have enhanced performance significantly for almost all natural language …