A report on the VarDial evaluation campaign 2020 M Găman, D Hovy, RT Ionescu, H Jauhiainen, T Jauhiainen, K Lindén, ... Proceedings of the 7th Workshop on NLP for Similar Languages, Varieties and …, 2020 | 60 | 2020 |
Findings of the VarDial evaluation campaign 2021 BR Chakravarthi, M Găman, RT Ionescu, H Jauhiainen, T Jauhiainen, ... EACL| VarDial, 2021 | 55 | 2021 |
" Do Not Celebrate Your Feast Without Your Neighbours": A Study of References to Feasts and Festivals in Non-literary Documents from Ramesside Period Deir El-Medina H Jauhiainen Helsingin yliopisto, 2009 | 46 | 2009 |
Evaluation of language identification methods using 285 languages T Jauhiainen, K Lindén, H Jauhiainen Proceedings of the 21st Nordic Conference on Computational Linguistics, 183-191, 2017 | 44 | 2017 |
HeLI, a word-based backoff method for language identification T Jauhiainen, K Lindén, H Jauhiainen Proceedings of the Third Workshop on NLP for Similar Languages, Varieties …, 2016 | 41 | 2016 |
HeLI-based experiments in Swiss German dialect identification TS Jauhiainen, HA Jauhiainen, BKJ Linden Workshop on NLP for Similar Languages, Varieties and Dialects, 254-262, 2018 | 39 | 2018 |
Aššur and his friends: a statistical analysis of Neo-Assyrian texts T Alstola, S Zaia, A Sahala, H Jauhiainen, S Svärd, K Lindén Journal of Cuneiform Studies 71 (1), 159-180, 2019 | 32 | 2019 |
Language and dialect identification of cuneiform texts T Jauhiainen, H Jauhiainen, T Alstola, K Lindén arXiv preprint arXiv:1903.01891, 2019 | 30 | 2019 |
Language model adaptation for language and dialect identification of text T Jauhiainen, K Lindén, H Jauhiainen Natural Language Engineering 25 (5), 561-583, 2019 | 27 | 2019 |
Discriminating similar languages with token-based backoff T Jauhiainen, H Jauhiainen, K Lindén Proceedings of the Joint Workshop on Language Technology for Closely Related …, 2015 | 26 | 2015 |
Fear in Akkadian texts: New digital perspectives on lexical semantics S Svärd, T Alstola, H Jauhiainen, A Sahala, K Lindén The Expression of Emotions in Ancient Egypt and Mesopotamia, 470-502, 2020 | 25 | 2020 |
Discriminating between Mandarin Chinese and Swiss-German varieties using adaptive language models TS Jauhiainen, HA Jauhiainen, BKJ Linden Workshop on NLP for Similar Languages, Varieties and Dialects, 178-187, 2019 | 25 | 2019 |
Iterative language model adaptation for Indo-Aryan language identification T Jauhiainen, H Jauhiainen, K Lindén Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties …, 2018 | 25 | 2018 |
Uralic Language Identification (ULI) 2020 shared task dataset and the Wanca 2017 corpus T Jauhiainen, H Jauhiainen, N Partanen, K Lindén arXiv preprint arXiv:2008.12169, 2020 | 23 | 2020 |
Language set identification in noisy synthetic multilingual documents T Jauhiainen, K Lindén, H Jauhiainen Computational Linguistics and Intelligent Text Processing: 16th …, 2015 | 20 | 2015 |
HeLI-OTS, off-the-shelf language identifier for text T Jauhiainen, H Jauhiainen, K Lindén International Conference on Language Resources and Evaluation, 3912-3922, 2022 | 18 | 2022 |
The Finno-Ugric Languages and the Internet Project H Jauhiainen, T Jauhiainen, K Lindén Septentrio Conference Series, 87–98, 2015 | 18 | 2015 |
Naive Bayes-based experiments in Romanian dialect identification T Jauhiainen, H Jauhiainen, K Lindén Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties …, 2021 | 17 | 2021 |
Wanca in Korp: Text corpora for underresourced Uralic languages H Jauhiainen, T Jauhiainen, K Lindén Proceedings of the Research data and humanities (RDHUM) 2019 conference, 21-40, 2019 | 15 | 2019 |
Evaluating HeLI with non-linear mappings T Jauhiainen, K Lindén, H Jauhiainen Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties …, 2017 | 15 | 2017 |