ParaNMT-50M: Pushing the limits of paraphrastic sentence embeddings with millions of machine translations

J Wieting, K Gimpel - arXiv preprint arXiv:1711.05732, 2017 - arxiv.org
We describe PARANMT-50M, a dataset of more than 50 million English-English sentential
paraphrase pairs. We generated the pairs automatically by using neural machine translation …

Multi-perspective sentence similarity modeling with convolutional neural networks

H He, K Gimpel, J Lin - Proceedings of the 2015 conference on …, 2015 - aclanthology.org
Modeling sentence similarity is complicated by the ambiguity and variability of linguistic
expression. To cope with these challenges, we propose a model for comparing sentences …

That's so annoying!!!: A lexical and frame-semantic embedding based data augmentation approach to automatic categorization of annoying behaviors using# …

WY Wang, D Yang - Proceedings of the 2015 conference on …, 2015 - aclanthology.org
We propose a novel data augmentation approach to enhance computational behavioral
analysis using social media text. In particular, we collect a Twitter corpus of the descriptions …

Pairwise word interaction modeling with deep neural networks for semantic similarity measurement

H He, J Lin - Proceedings of the 2016 conference of the north …, 2016 - aclanthology.org
Textual similarity measurement is a challenging problem, as it requires understanding the
semantics of input sentences. Most previous neural network models use coarse-grained …

Paraphrasing revisited with neural machine translation

J Mallinson, R Sennrich, M Lapata - … of the 15th Conference of the …, 2017 - aclanthology.org
Recognizing and generating paraphrases is an important component in many natural
language processing applications. A well-established technique for automatically extracting …

A continuously growing dataset of sentential paraphrases

W Lan, S Qiu, H He, W Xu - arXiv preprint arXiv:1708.00391, 2017 - arxiv.org
A major challenge in paraphrase research is the lack of parallel corpora. In this paper, we
present a new method to collect large-scale sentential paraphrases from Twitter by linking …

Neural network models for paraphrase identification, semantic textual similarity, natural language inference, and question answering

W Lan, W Xu - arXiv preprint arXiv:1806.04330, 2018 - arxiv.org
In this paper, we analyze several neural network designs (and their variations) for sentence
pair modeling and compare their performance extensively across eight datasets, including …

The BQ corpus: A large-scale domain-specific Chinese corpus for sentence semantic equivalence identification

J Chen, Q Chen, X Liu, H Yang, D Lu… - Proceedings of the 2018 …, 2018 - aclanthology.org
This paper introduces the Bank Question (BQ) corpus, a Chinese corpus for sentence
semantic equivalence identification (SSEI). The BQ corpus contains 120,000 question pairs …

Multiple instance learning networks for fine-grained sentiment analysis

S Angelidis, M Lapata - Transactions of the Association for …, 2018 - direct.mit.edu
We consider the task of fine-grained sentiment analysis from the perspective of multiple
instance learning (MIL). Our neural model is trained on document sentiment labels, and …

A deep network model for paraphrase detection in short text messages

B Agarwal, H Ramampiaro, H Langseth… - Information Processing & …, 2018 - Elsevier
This paper is concerned with paraphrase detection, i.e., identifying sentences that are
semantically identical. The ability to detect similar sentences written in natural language is …