The PROIEL treebank family: a standard for early attestations of Indo-European languages

H Eckhoff, K Bech, G Bouma, K Eide, D Haug… - Language resources …, 2018 - Springer
This article describes a family of dependency treebanks of early attestations of Indo-
European languages originating in the parallel treebank built by the members of the project …

Automatic parsing as an efficient pre-annotation tool for historical texts

HM Eckhoff, A Berdičevskis - Proceedings of the Workshop on …, 2016 - aclanthology.org
Historical treebanks tend to be manually annotated, which is not surprising, since state-of-
the-art parsers are not accurate enough to ensure high-quality annotation for historical texts …

Neural Morphological Tagging for Slavic: Strengths and Weaknesses

J Besters-Dilger - Scripta & e-Scripta, 2021 - ceeol.com
The neural network tagger CLStM has been applied to the Old Russian Žitie Evfimija
Velikogo (GIM, Chud. 20), a copy of the second half of the 14th century. The strengths of this …

A reusable tagset for the morphologically rich language in change: A case of Middle Russian

ON Lyashevskaya - Komp'juternaja Lingvistika i Intellektual'nye …, 2019 - elibrary.ru
The paper discusses the standardization efforts to create a morphological standard for the
Middle Russian corpus, which is part of the historical collection of the Russian National …

The Effect of (Historical) Language Variation on the East Slavic Lects Lemmatisers Performance

I Afanasev, O Lyashevskaya, S Rebrikov… - Journal of Linguistics …, 2023 - sciendo.com
The need to develop tools for historical and regional variations is becoming more urgent in
natural language processing. In this paper, we present two candidate systems for …

Disambiguation in context in the Russian National Corpus: 20 years later

MTS AI - Proceedings of the International Conference “Dialogue, 2023 - researchgate.net
An updated annotation of the Main, Media, and some other corpora of the Russian National
Corpus (RNC) features the part-of-speech and other morphological information, lemmas …

String Similarity Measures for Evaluating the

I Afanasev, O Lyashevskaya - Structuring Lexical Data and …, 2024 - books.google.com
Modern historical lexicography faces the need of adopting NLP methods, with an automatic
corpus/dictionary system being a possible pipeline. This approach requires a scalable …

The project of a deeply tagged parallel corpus of Middle Russian translations from Latin

EG Sokolov - Journal of Applied Linguistics and Lexicography, 2019 - cyberleninka.ru
Tagged parallel corpora are powerful tools for the analysis of natural language. Moreover,
for historical linguistics, whose most peculiar shortcoming is lack of living native speakers …

Building and using online corpora for (historical) linguistic research

M Jøhndal - Computational Linguistics and the Dating of Early Irish …, 2016 - johndal.com
Building and using online corpora for (historical) linguistic research. Marius L. Jøhndal, 15 December 2016 …

Lexico-grammatical annotation of the Middle Russian corpus 1400-1700: a computational approach

G Tat'iana, S Tat'iana, L Ol'ga - St. Tikhon's University Review …, 2016 - periodical.pstgu.ru
The paper discusses two approaches to the automatic lexico-grammatical tagging of the
Middle Russian texts (1400–1700), included in the Russian National Corpus (RNC). The …