Multilingual is not enough: BERT for Finnish
Deep learning-based language models pretrained on large unannotated text corpora have
been demonstrated to allow efficient transfer learning for natural language processing, with …
Universal dependencies v1: A multilingual treebank collection
Cross-linguistically consistent annotation is necessary for sound comparative evaluation
and cross-lingual learning experiments. It is also useful for multilingual system development …
Universal Stanford dependencies: A cross-linguistic typology
Revisiting the now de facto standard Stanford dependency representation, we propose an
improved taxonomy to capture grammatical relations across languages, including …
Joint morphological and syntactic analysis for richly inflected languages
Joint morphological and syntactic analysis has been proposed as a way of improving
parsing accuracy for richly inflected languages. Starting from a transition-based model for …
From the world to word order: Deriving biases in noun phrase order from statistical properties of the world
The world's languages exhibit striking diversity. At the same time, recurring linguistic
patterns suggest the possibility that this diversity is shaped by features of human cognition …
FinEst BERT and CroSloEngual BERT: less is more in multilingual models
M Ulčar, M Robnik-Šikonja - … Conference, TSD 2020, Brno, Czech Republic …, 2020 - Springer
Large pretrained masked language models have become state-of-the-art solutions for many
NLP problems. The research has been mostly focused on English language, though. While …
A broad-coverage corpus for Finnish named entity recognition
J Luoma, M Oinonen, M Pyykönen… - Proceedings of the …, 2020 - aclanthology.org
We present a new manually annotated corpus for broad-coverage named entity recognition
for Finnish. Building on the original Universal Dependencies Finnish corpus of 754 …
Classifying online corporate reputation with machine learning: a study in the banking domain
Purpose User-generated social media comments can be a useful source of information for
understanding online corporate reputation. However, the manual classification of these …
Exploring predictive uncertainty and calibration in NLP: A study on the impact of method & data scarcity
We investigate the problem of determining the predictive confidence (or, conversely,
uncertainty) of a neural classifier through the lens of low-resource languages. By training …
IMST: A revisited Turkish dependency treebank
U Sulubacak, G Eryiğit… - … Conference on Turkic …, 2016 - researchportal.helsinki.fi
In this paper, we present a critical analysis of the dependency annotation framework used in
the METU-Sabancı Treebank (MST), and propose new annotation schemes that would …