Multiword expression processing: A survey

M Constant, G Eryiğit, J Monti, L Van Der Plas… - Computational …, 2017 - direct.mit.edu
Multiword expressions (MWEs) are a class of linguistic forms spanning conventional word
boundaries that are both idiosyncratic and pervasive across different languages. The …

AStitchInLanguageModels: Dataset and methods for the exploration of idiomaticity in pre-trained language models

HT Madabushi, E Gow-Smith, C Scarton… - arXiv preprint arXiv …, 2021 - arxiv.org
Despite their success in a variety of NLP tasks, pre-trained language models, due to their
heavy reliance on compositionality, fail in effectively capturing the meanings of multiword …

Discriminative lexical semantic segmentation with gaps: running the MWE gamut

N Schneider, E Danchik, C Dyer… - Transactions of the …, 2014 - direct.mit.edu
We present a novel representation, evaluation measure, and supervised models for the task
of identifying the multiword expressions (MWEs) in a sentence, resulting in a lexical …

A corpus and model integrating multiword expressions and supersenses

N Schneider, NA Smith - A corpus and model integrating …, 2015 - research.ed.ac.uk
This paper introduces a task of identifying and semantically classifying lexical expressions in
running text. We investigate the online reviews genre, adding semantic supersense …

A transition-based system for joint lexical and syntactic analysis

M Constant, J Nivre - Proceedings of the 54th Annual Meeting of …, 2016 - aclanthology.org
We present a transition-based system that jointly predicts the syntactic structure and lexical
units of a sentence by building two structures over the input words: a syntactic dependency …

SemEval-2016 Task 10: Detecting Minimal Semantic Units and their Meanings (DiMSUM)

N Schneider, D Hovy, A Johannsen… - … Workshop on Semantic …, 2016 - research.ed.ac.uk
This task combines the labeling of multiword expressions and supersenses (coarse-grained
classes) in an explicit, yet broad-coverage paradigm for lexical semantics. Nine systems …

PARSEME multilingual corpus of verbal multiword expressions

A Savary, M Candito, VB Mititelu, E Bejček… - … expressions at length …, 2018 - library.oapen.org
One of the basic ideas underlying linguistic modelling is compositionality (Baggio et al.
2012), seen as a property of language items (Janssen 2001; Partee et al. 1990) or of …

A multiword expression data set: Annotating non-compositionality and conventionalization for English noun compounds

M Farahmand, A Smith, J Nivre - Proceedings of the 11th …, 2015 - aclanthology.org
Scarcity of multiword expression data sets raises a fundamental challenge to evaluating the
systems that deal with these linguistic structures. In this work we attempt to address this …

Corpus annotation

J Newman, C Cox - A practical handbook of corpus linguistics, 2021 - Springer
In this chapter, we provide an overview of the main concepts relating to corpus annotation,
along with some discussion of the practical aspects of creating annotated texts and working …

As Hill seems to suggest: Variability in formulaic sequences with interpersonal functions in L1 novice and expert academic writing

Y Wang - Journal of English for Academic Purposes, 2018 - Elsevier
Formulaic sequences (FSs) are pervasive in natural language use and play an important
role in differentiating socially-situated practices. The predominant trend in this research area …