[PDF] The Hungarian National Corpus.
T Váradi - LREC, 2002 - researchgate.net
The paper reports on the development of the Hungarian National Corpus, which was
completed at the end of 2001 after four years' effort. The HNC is designed to be a balanced …
[PDF] PurePos 2.0: a hybrid tool for morphological disambiguation
We present PurePos, an open-source HMM-based automatic morphological annotation tool.
PurePos can perform tagging and lemmatization at the same time, it is very fast to train, with …
[PDF] Developing an automatic part-of-speech tagger for Scottish Gaelic
This paper describes an on-going project that seeks to develop the first automatic PoS
tagger for Scottish Gaelic. Adapting the PAROLE tagset for Irish, we manually re-tagged a …
PurePos: An Open Source Morphological Disambiguator
This paper presents PurePos, a new open source Hidden Markov model based
morphological tagger tool that has an interface to an integrated morphological analyzer and …
[PDF] Web-based frequency dictionaries for medium density languages
Frequency dictionaries play an important role both in psycholinguistic experiment design
and in language technology. The paper describes a new, freely available, web-based …
[PDF] The Role of Parallel Corpora in Bilingual Lexicography.
E Héja - LREC, 2010 - lexitron.nectec.or.th
This paper describes an approach based on word alignment on parallel corpora, which aims
at facilitating the lexicographic work of dictionary building. Although this method has been …
Using embedding models for lexical categorization in morphologically rich languages
B Siklósi - … Linguistics and Intelligent Text Processing: 17th …, 2018 - Springer
Neural-network-based semantic embedding models are relatively new but popular tools in
the field of natural language processing. It has been shown that continuous embedding …
Automatic structuring and correction suggestion system for Hungarian clinical records
The first steps of processing clinical documents are structuring and normalization. In this
paper we demonstrate how we compensate the lack of any structure in the raw data by …
Speech technologies for Serbian and kindred South Slavic languages
This chapter will present the results of the research and development of speech
technologies for Serbian and other kindred South Slavic languages used in five countries of …
[PDF] Using a morphological analyzer in high precision POS tagging of Hungarian
The paper presents an evaluation of maxent POS disambiguation systems that incorporate
an open source morphological analyzer to constrain the probabilistic models. The …