Can Transformer be too compositional? Analysing idiom processing in neural machine translation

V Dankers, CG Lucas, I Titov - arXiv preprint arXiv:2205.15301, 2022 - arxiv.org
Unlike literal expressions, idioms' meanings do not directly follow from their parts, posing a
challenge for neural machine translation (NMT). NMT models are often unable to translate …

Idiomatic expression identification using semantic compatibility

Z Zeng, S Bhat - Transactions of the Association for Computational …, 2021 - direct.mit.edu
Idiomatic expressions are an integral part of natural language and constantly being added to
a language. Owing to their non-compositionality and their ability to take on a figurative or …

Does BERT understand idioms? A probing-based empirical study of BERT encodings of idioms

M Tan, J Jiang - 2021 - ink.library.smu.edu.sg
Understanding idioms is important in NLP. In this paper, we study to what extent a pre-trained
BERT model can encode the meaning of a potentially idiomatic expression (PIE) in a certain …

Investigating Idiomaticity in Word Representations

W He, TK Vieira, M Garcia, C Scarton, M Idiart… - Computational …, 2024 - direct.mit.edu
Idiomatic expressions are an integral part of human languages, often used to express
complex ideas in compressed or conventional ways (e.g. eager beaver as a keen and …

Data-driven identification of idioms in song lyrics

M Amin, P Fankhauser, M Kupietz… - Proceedings of the 17th …, 2021 - aclanthology.org
The automatic recognition of idioms poses a challenging problem for NLP applications.
Whereas native speakers can intuitively handle multiword expressions whose compositional …

Contextualized embeddings encode monolingual and cross-lingual knowledge of idiomaticity

S Fakharian, P Cook - Proceedings of the 17th workshop on …, 2021 - aclanthology.org
Potentially idiomatic expressions (PIEs) are ambiguous between non-compositional
idiomatic interpretations and transparent literal interpretations. For example, “hit the road” …

Leveraging three types of embeddings from masked language models in idiom token classification

R Takahashi, R Sasano, K Takeda - Proceedings of the 11th Joint …, 2022 - aclanthology.org
Many linguistic expressions have idiomatic and literal interpretations, and the automatic
distinction of these two interpretations has been studied for decades. Recent research has …

MWE as WSD: Solving Multiword Expression Identification with Word Sense Disambiguation

J Tanner, J Hoffman - arXiv preprint arXiv:2303.06623, 2023 - arxiv.org
Recent approaches to word sense disambiguation (WSD) utilize encodings of the sense
gloss (definition), in addition to the input context, to improve performance. In this work we …

CLIX: Cross-Lingual Explanations of Idiomatic Expressions

A Gluck, K von der Wense, M Pacheco - arXiv preprint arXiv:2501.03191, 2025 - arxiv.org
Automated definition generation systems have been proposed to support vocabulary
expansion for language learners. The main barrier to the success of these systems is that …

NER4ID at SemEval-2022 Task 2: Named entity recognition for idiomaticity detection

S Tedeschi, R Navigli - … of the 16th International Workshop on …, 2022 - iris.uniroma1.it
Idioms are lexically-complex phrases whose meaning cannot be derived by compositionally
interpreting their components. Although the automatic identification and understanding of …