A brief overview of universal sentence representation methods: A linguistic view

R Li, X Zhao, MF Moens - ACM Computing Surveys (CSUR), 2022 - dl.acm.org
How to transfer the semantic information in a sentence to a computable numerical
embedding form is a fundamental problem in natural language processing. An informative …

Cyberbullying detection: Hybrid models based on machine learning and natural language processing techniques

C Raj, A Agarwal, G Bharathy, B Narayan, M Prasad - Electronics, 2021 - mdpi.com
The rise in web and social media interactions has resulted in the effortless proliferation of
offensive language and hate speech. Such online harassment, insults, and attacks are …

A survey on neural word embeddings

E Sezerer, S Tekir - arXiv preprint arXiv:2110.01804, 2021 - arxiv.org
Understanding human language has been a sub-challenge on the way of intelligent
machines. The study of meaning in natural language processing (NLP) relies on the …

Modeling language variation and universals: A survey on typological linguistics for natural language processing

EM Ponti, H O'Horan, Y Berzak, I Vulić… - Computational …, 2019 - direct.mit.edu
Linguistic typology aims to capture structural and semantic variation across the world's
languages. A large-scale typology could provide excellent guidance for multilingual Natural …

[BOOK][B] Distributional semantics

A Lenci, M Sahlgren - 2023 - books.google.com
Distributional semantics develops theories and methods to represent the meaning of natural
language expressions, with vectors encoding their statistical distribution in linguistic …

ClaimRank: Detecting check-worthy claims in Arabic and English

I Jaradat, P Gencheva, A Barrón-Cedeño… - arXiv preprint arXiv …, 2018 - arxiv.org
We present ClaimRank, an online system for detecting check-worthy claims. While originally
trained on political debates, the system can work for any kind of text, e.g., interviews or …

Specialising word vectors for lexical entailment

I Vulić, N Mrkšić - arXiv preprint arXiv:1710.06371, 2017 - arxiv.org
We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that
transforms any input word vector space to emphasise the asymmetric relation of lexical …

Concatenated power mean word embeddings as universal cross-lingual sentence representations

A Rücklé, S Eger, M Peyrard, I Gurevych - arXiv preprint arXiv:1803.01400, 2018 - arxiv.org
Average word embeddings are a common baseline for more sophisticated sentence
embedding techniques. However, they typically fall short of the performances of more …

HyperLex: A large-scale evaluation of graded lexical entailment

I Vulić, D Gerz, D Kiela, F Hill… - Computational Linguistics, 2017 - direct.mit.edu
We introduce HyperLex—a data set and evaluation resource that quantifies the extent of the
semantic category membership, that is, type-of relation, also known as hyponymy …

Explicit retrofitting of distributional word vectors

G Glavaš, I Vulić - Proceedings of the 56th Annual Meeting of the …, 2018 - aclanthology.org
Semantic specialization of distributional word vectors, referred to as retrofitting, is a process
of fine-tuning word vectors using external lexical knowledge in order to better embed some …