A survey of data augmentation approaches for NLP
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …
resource domains, new tasks, and the popularity of large-scale neural networks that require …
Universal language model fine-tuning for text classification
Inductive transfer learning has greatly impacted computer vision, but existing approaches in
NLP still require task-specific modifications and training from scratch. We propose Universal …
NLP still require task-specific modifications and training from scratch. We propose Universal …
An efficient framework for learning sentence representations
In this work we propose a simple and efficient framework for learning sentence
representations from unlabelled data. Drawing inspiration from the distributional hypothesis …
representations from unlabelled data. Drawing inspiration from the distributional hypothesis …
A brief overview of universal sentence representation methods: A linguistic view
How to transfer the semantic information in a sentence to a computable numerical
embedding form is a fundamental problem in natural language processing. An informative …
embedding form is a fundamental problem in natural language processing. An informative …
Semantically equivalent adversarial rules for debugging NLP models
Complex machine learning models for NLP are often brittle, making different predictions for
input instances that are extremely similar semantically. To automatically detect this behavior …
input instances that are extremely similar semantically. To automatically detect this behavior …
ParaNMT-50M: Pushing the limits of paraphrastic sentence embeddings with millions of machine translations
We describe PARANMT-50M, a dataset of more than 50 million English-English sentential
paraphrase pairs. We generated the pairs automatically by using neural machine translation …
paraphrase pairs. We generated the pairs automatically by using neural machine translation …
Sbert-wk: A sentence embedding method by dissecting bert-based word models
Sentence embedding is an important research topic in natural language processing (NLP)
since it can transfer knowledge to downstream tasks. Meanwhile, a contextualized word …
since it can transfer knowledge to downstream tasks. Meanwhile, a contextualized word …
[BOOK][B] Text data mining
With the rapid development and popularization of Internet and mobile communication
technologies, text data mining has attracted much attention. In particular, with the wide use …
technologies, text data mining has attracted much attention. In particular, with the wide use …
Beyond BLEU: training neural machine translation with semantic similarity
While most neural machine translation (NMT) systems are still trained using maximum
likelihood estimation, recent work has demonstrated that optimizing systems to directly …
likelihood estimation, recent work has demonstrated that optimizing systems to directly …
Content selection in deep learning models of summarization
We carry out experiments with deep learning models of summarization across the domains
of news, personal stories, meetings, and medical articles in order to understand how content …
of news, personal stories, meetings, and medical articles in order to understand how content …