Machine knowledge: Creation and curation of comprehensive knowledge bases

G Weikum, XL Dong, S Razniewski… - … and Trends® in …, 2021 - nowpublishers.com
Equipping machines with comprehensive knowledge of the world's entities and their
relationships has been a longstanding goal of AI. Over the last decade, large-scale …

Academic plagiarism detection: a systematic literature review

T Foltýnek, N Meuschke, B Gipp - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
This article summarizes the research on computational methods to detect academic
plagiarism by systematically reviewing 239 research papers published between 2013 and …

AugGPT: Leveraging ChatGPT for text data augmentation

H Dai, Z Liu, W Liao, X Huang, Y Cao… - … Transactions on Big …, 2025 - ieeexplore.ieee.org
Text data augmentation is an effective strategy for overcoming the challenge of limited
sample sizes in many natural language processing (NLP) tasks. This challenge is especially …

A survey of data augmentation approaches for NLP

SY Feng, V Gangal, J Wei, S Chandar… - arXiv preprint arXiv …, 2021 - arxiv.org
Data augmentation has recently seen increased interest in NLP due to more work in low-
resource domains, new tasks, and the popularity of large-scale neural networks that require …

[BOOK][B] Neural network methods in natural language processing

Y Goldberg - 2017 - books.google.com
Neural networks are a family of powerful machine learning models and this book focuses on
their application to natural language data. The first half of the book (Parts I and II) covers the …

A large annotated corpus for learning natural language inference

SR Bowman, G Angeli, C Potts, CD Manning - arXiv preprint arXiv …, 2015 - arxiv.org
Understanding entailment and contradiction is fundamental to understanding natural
language, and inference about entailment and contradiction is a valuable testing ground for …

Towards universal paraphrastic sentence embeddings

J Wieting, M Bansal, K Gimpel, K Livescu - arXiv preprint arXiv …, 2015 - arxiv.org
We consider the problem of learning general-purpose, paraphrastic sentence embeddings
based on supervision from the Paraphrase Database (Ganitkevitch et al., 2013). We …

RESIDE: Improving distantly-supervised neural relation extraction using side information

S Vashishth, R Joshi, SS Prayaga… - arXiv preprint arXiv …, 2018 - arxiv.org
Distantly-supervised Relation Extraction (RE) methods train an extractor by automatically
aligning relation instances in a Knowledge Base (KB) with unstructured text. In addition to …

From word to sense embeddings: A survey on vector representations of meaning

J Camacho-Collados, MT Pilehvar - Journal of Artificial Intelligence …, 2018 - jair.org
Over the past years, distributed semantic representations have proved to be effective and
flexible keepers of prior knowledge to be integrated into downstream applications. This …

Counter-fitting word vectors to linguistic constraints

N Mrkšić, DO Séaghdha, B Thomson, M Gašić… - arXiv preprint arXiv …, 2016 - arxiv.org
In this work, we present a novel counter-fitting method which injects antonymy and
synonymy constraints into vector space representations in order to improve the vectors' …