The role of typological feature prediction in NLP and linguistics

J Bjerva - Computational Linguistics, 2023 - direct.mit.edu
Computational typology has gained traction in the field of Natural Language Processing
(NLP) in recent years, as evidenced by the increasing number of papers on the topic and the …

A crosslingual investigation of conceptualization in 1335 languages

Y Liu, H Ye, L Weissweiler, P Wicke, R Pei… - ar…

… cross-lingual datasets: The case of phonology, concreteness, and affectiveness

Y Chen, J Bjerva - arXiv preprint arXiv:2306.02646, 2023 - arxiv.org
Colexification refers to the linguistic phenomenon where a single lexical form is used to
convey multiple meanings. By studying cross-lingual colexifications, researchers have …

Multilingual Gradient Word-Order Typology from Universal Dependencies

E Baylor, E Ploeger, J Bjerva - arXiv preprint arXiv:2402.01513, 2024 - arxiv.org
While information from the field of linguistic typology has the potential to improve
performance on NLP tasks, reliable typological data is a prerequisite. Existing typological …

Machine Learning and Transformer Models for Language-Independent Sentiment Analysis

F Ullah, S Faizullah, IU Khan, T Alghamdi… - Computational …, 2024 - researchgate.net
Experimental results demonstrate that transformer models, particularly XLM-RoBERTa with
prompt-based fine-tuning, outperform both classical and deep learning methods. The results …

Patterns of Persistence and Diffusibility across the World's Languages

Y Chen, J Bjerva - arXiv preprint arXiv:2401.01698, 2024 - arxiv.org
Language similarities can be caused by genetic relatedness, areal contact, universality, or
chance. Colexification, i.e., a type of similarity where a single lexical form is used to convey …

Exploring Phonetic Features in Language Embeddings for Unseen Language Varieties of Austrian German

L Gutscher, M Pucher - Proceedings of the 20th Conference on …, 2024 - aclanthology.org
Vectorized language embeddings of raw audio data improve tasks like language
recognition, automatic speech recognition, and machine translation. Although embeddings …