[BOOK][B] Corpus linguistics and statistics with R

G Desagulier - 2017 - Springer
In the summer of 2008, I gave a talk at an international conference in Brighton. The talk was
about constructions involving multiple hedging in American English (e.g., I'm gonna have to …

Discourse markers and (dis)fluency

L Crible - 2018 - torrossa.com
2.1 Disfluency or repair? Levelt's legacy; 2.2 Holistic definitions of fluency; 2.3
Componential approaches to fluency and disfluency; 2.3.1 Qualitative components of …

The status of function words in dependency grammar: A critique of Universal Dependencies (UD)

T Osborne, K Gerdes - Glossa: a journal of general linguistics …, 2019 - inria.hal.science
The article examines the Universal Dependencies (UD) annotation scheme. The UD project
is an international initiative to produce treebanks of the world's languages, whereby the …

Sequences of Intonation Units form a ~1 Hz rhythm

M Inbar, E Grossman, AN Landau - Scientific reports, 2020 - nature.com
Studies of speech processing investigate the relationship between temporal structure in
speech stimuli and neural activity. Despite clear evidence that the brain tracks speech at low …

DUEL: A multi-lingual multimodal dialogue corpus for disfluency, exclamations and laughter

J Hough, Y Tian, L De Ruiter, S Betz… - Proceedings of the …, 2016 - aclanthology.org
We present the DUEL corpus, consisting of 24 hours of natural, face-to-face, loosely task-
directed dialogue in German, French and Mandarin Chinese. The corpus is uniquely …

Crowdsourcing complex language resources: Playing to annotate dependency syntax

B Guillaume, K Fort, N Lefebvre - International Conference on …, 2016 - inria.hal.science
This article presents the results we obtained on a complex annotation task (that of
dependency syntax) using a specifically designed Game with a Purpose, ZombiLingo. We …

A geometric notion of causal probing

C Guerner, A Svete, T Liu, A Warstadt… - arXiv preprint arXiv …, 2023 - arxiv.org
The linear subspace hypothesis (Bolukbasi et al., 2016) states that, in a language model's
representation space, all information about a concept such as verbal number is encoded in …

A systematic analysis of morphological content in BERT models for multiple languages

D Edmiston - arXiv preprint arXiv:2004.03032, 2020 - arxiv.org
This work describes experiments which probe the hidden representations of several BERT-
style models for morphological content. The goal is to examine the extent to which discrete …

Treebanking user-generated content: a UD based overview of guidelines, corpora and unified recommendations

M Sanguinetti, C Bosco, L Cassidy, Ö Çetinoğlu… - Language Resources …, 2023 - Springer
This article presents a discussion on the main linguistic phenomena which cause difficulties
in the analysis of user-generated texts found on the web and in social media, and proposes …

Discourse markers and (dis)fluency in English and French: Variation and combination in the DisFrEn corpus

L Crible - International Journal of Corpus Linguistics, 2017 - jbe-platform.com
While discourse markers (DMs) and (dis)fluency have been extensively studied in the past
as separate phenomena, corpus-based research combining large-scale yet fine-grained …