Comparison of text preprocessing methods
Text preprocessing is not only an essential step to prepare the corpus for modeling but also
a key area that directly affects the natural language processing (NLP) application results. For …
Multiword expression processing: A survey
Multiword expressions (MWEs) are a class of linguistic forms spanning conventional word
boundaries that are both idiosyncratic and pervasive across different languages. The …
Multiword expression identification with tree substitution grammars: A parsing tour de force with French
Multiword expressions (MWE), a known nuisance for both linguistics and NLP, blur the lines
between syntax and semantics. Previous work on MWE identification has relied primarily on …
[PDF] A transition-based system for joint lexical and syntactic analysis
We present a transition-based system that jointly predicts the syntactic structure and lexical
units of a sentence by building two structures over the input words: a syntactic dependency …
Parsing models for identifying multiword expressions
Multiword expressions lie at the syntax/semantics interface and have motivated alternative
theories of syntax like Construction Grammar. Until now, however, syntactic analysis and …
Without lexicons, multiword expression identification will never fly: A position statement
Because most multiword expressions (MWEs), especially verbal ones, are semantically non-
compositional, their automatic identification in running text is a prerequisite for semantically …
PARSEME–PARSing and Multiword Expressions within a European multilingual network
The aim of this paper is to present PARSEME, a COST Action devoted to the issue of
Multiword Expressions in parsing and in linguistic resources (corpora, lexicons). This is a …
Efficient continue training of temporal language model with structural information
Z Su, J Li, Z Zhang, Z Zhou… - Findings of the Association …, 2023 - aclanthology.org
Current language models are mainly trained on snapshots of data gathered at a particular
time, which decreases their capability to generalize over time and model language change …
Collocations of fictive motion verbs in adventure tourism: A corpus-based study of the English language
This paper investigates the collocations produced by a set of fictive motion verbs found in a
specialized corpus representing the language of adventure tourism. Since our ultimate aim …
[Book] Facets of prefabrication. Perspectives on modelling and detecting phraseological units
P Pęzik - 2018 - ceeol.com
Corpus-based studies have brought fresh insights into the role of collocability and lexico-
grammatical patterning as core aspects of language permeating its structure and use. Facets …