Natural language processing for dialects of a language: A survey

A Joshi, R Dabre, D Kanojia, Z Li, H Zhan… - ACM Computing …, 2024 - dl.acm.org
State-of-the-art natural language processing (NLP) models are trained on massive training
corpora, and report a superlative performance on evaluation datasets. This survey delves …

Overview of the 5th author profiling task at PAN 2017: Gender and language variety identification in Twitter

F Rangel, P Rosso, M Potthast… - Working notes papers of …, 2017 - downloads.webis.de
This overview presents the framework and the results of the Author Profiling task at PAN
2017. The objective of this year is to address gender and language variety identification. For …

Automatic language identification in texts: A survey

T Jauhiainen, M Lui, M Zampieri, T Baldwin… - Journal of Artificial …, 2019 - jair.org
Language identification ("LI") is the problem of determining the natural language that a
document or part thereof is written in. Automatic LI has been extensively researched for over …

Computational sociolinguistics: A survey

D Nguyen, AS Doğruöz, CP Rosé… - Computational …, 2016 - direct.mit.edu
Language is a social phenomenon and variation is inherent to its social nature.
Recently, there has been a surge of interest within the computational linguistics (CL) …

Discriminating between similar languages and Arabic dialect identification: A report on the third DSL shared task

S Malmasi, M Zampieri, N Ljubešić… - Proceedings of the …, 2016 - aclanthology.org
We present the results of the third edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the VarDial'2016 workshop at …

Findings of the VarDial evaluation campaign 2017

M Zampieri, S Malmasi, N Ljubešić… - Proceedings of the …, 2017 - aclanthology.org
We present the results of the VarDial Evaluation Campaign on Natural Language
Processing (NLP) for Similar Languages, Varieties and Dialects, which we organized as part …

A report on the DSL shared task 2014

M Zampieri, L Tan, N Ljubešić… - Proceedings of the first …, 2014 - aclanthology.org
This paper summarizes the methods, results and findings of the Discriminating between
Similar Languages (DSL) shared task 2014. The shared task provided data from 13 different …

Overview of the DSL shared task 2015

M Zampieri, L Tan, N Ljubešić… - Proceedings of the …, 2015 - aclanthology.org
We present the results of the 2nd edition of the Discriminating between Similar Languages
(DSL) shared task, which was organized as part of the LT4VarDial'2015 workshop and …

Natural language processing for similar languages, varieties, and dialects: A survey

M Zampieri, P Nakov, Y Scherrer - Natural Language Engineering, 2020 - cambridge.org
There has been a lot of recent interest in the natural language processing (NLP) community
in the computational processing of language varieties and dialects, with the aim to improve …

Language variety identification with true labels

M Zampieri, K North, T Jauhiainen, M Felice… - arXiv preprint arXiv …, 2023 - arxiv.org
Language identification is an important first step in many IR and NLP applications. Most
publicly available language identification datasets, however, are compiled under the …