All bark and no bite: Rogue dimensions in transformer language models obscure representational quality
Similarity measures are a vital tool for understanding how language models represent and
process language. Standard representational similarity measures such as cosine similarity …
Can Transformer be too compositional? Analysing idiom processing in neural machine translation
Unlike literal expressions, idioms' meanings do not directly follow from their parts, posing a
challenge for neural machine translation (NMT). NMT models are often unable to translate …
SemEval-2022 task 2: Multilingual idiomaticity detection and sentence embedding
This paper presents the shared task on Multilingual Idiomaticity Detection and Sentence
Embedding, which consists of two subtasks: (a) a binary classification task aimed at …
AStitchInLanguageModels: Dataset and methods for the exploration of idiomaticity in pre-trained language models
Despite their success in a variety of NLP tasks, pre-trained language models, due to their
heavy reliance on compositionality, fail in effectively capturing the meanings of multiword …
Construction grammar provides unique insight into neural language models
Construction Grammar (CxG) has recently been used as the basis for probing studies that
have investigated the performance of large pretrained language models (PLMs) with respect …
Processamento de Linguagem Natural: conceitos, técnicas e aplicações em português
Natural Language Processing (NLP) emerged at practically the same time as
computers, around the 1940s, since machine translation between languages …
Semantics of Multiword Expressions in Transformer-Based Models: A Survey
Multiword expressions (MWEs) are composed of multiple words and exhibit variable
degrees of compositionality. As such, their meanings are notoriously difficult to model, and it …
ID10M: Idiom identification in 10 languages
Idioms are phrases which present a figurative meaning that cannot be (completely) derived
by looking at the meaning of their individual components. Identifying and understanding …
Distilling hypernymy relations from language models: On the effectiveness of zero-shot taxonomy induction
In this paper, we analyze zero-shot taxonomy learning methods which are based on
distilling knowledge from language models via prompting and sentence scoring. We show …
A systematic search for compound semantics in pretrained BERT architectures
To date, transformer-based models such as BERT have been less successful in predicting
compositionality of noun compounds than static word embeddings. This is likely related to a …