A brief survey of textual dialogue corpora

HG Oliveira, P Ferreira, D Martins… - Proceedings of the …, 2022 - aclanthology.org
Several dialogue corpora are currently available for research purposes, but they still fall
short for the growing interest in the development of dialogue systems with their own specific …

Robust dialogue state tracking with weak supervision and sparse data

M Heck, N Lubis, C Niekerk, S Feng… - Transactions of the …, 2022 - direct.mit.edu
Generalizing dialogue state tracking (DST) to new data is especially challenging due to the
strong reliance on abundant and fine-grained supervision during training. Sample sparsity …

BaSCo: An annotated Basque-Spanish code-switching corpus for natural language understanding

M Aguirre, L García-Sardiña, M Serras… - Proceedings of the …, 2022 - aclanthology.org
The main objective of this work is the elaboration and public release of BaSCo, the first
corpus with annotated linguistic resources encompassing Basque-Spanish code-switching …

EuskañolDS: A Naturally Sourced Corpus for Basque-Spanish Code-Switching

M Heredia, J Barnes, A Soroa - arXiv preprint arXiv:2502.03188, 2025 - arxiv.org
Code-switching (CS) remains a significant challenge in Natural Language Processing
(NLP), mainly due to a lack of relevant data. In the context of the contact between the Basque …

Knowledge transfer for active learning in textual anonymisation

L García-Sardiña, M Serras, A del Pozo - Statistical Language and Speech …, 2018 - Springer
Data privacy compliance has gained a lot of attention over the last years. The automation of
the de-identification process is a challenging task that often requires annotating in-domain …