The Hungarian National Corpus.

T Váradi - LREC, 2002 - researchgate.net
The paper reports on the development of the Hungarian National Corpus, which was
completed at the end of 2001 after four years' effort. The HNC is designed to be a balanced …

PurePos 2.0: a hybrid tool for morphological disambiguation

G Orosz, A Novák - … in Natural Language Processing RANLP 2013, 2013 - aclanthology.org
We present PurePos, an open-source HMM-based automatic morphological annotation tool.
PurePos can perform tagging and lemmatization at the same time, it is very fast to train, with …

Developing an automatic part-of-speech tagger for Scottish Gaelic

W Lamb, S Danso - Proceedings of the First Celtic Language …, 2014 - aclanthology.org
This paper describes an on-going project that seeks to develop the first automatic PoS
tagger for Scottish Gaelic. Adapting the PAROLE tagset for Irish, we manually re-tagged a …

PurePos: An Open Source Morphological Disambiguator

G Orosz, A Novák - … Workshop on Natural Language Processing and …, 2012 - scitepress.org
This paper presents PurePos, a new open source Hidden Markov model based
morphological tagger tool that has an interface to an integrated morphological analyzer and …

Web-based frequency dictionaries for medium density languages

A Kornai, P Halácsy, V Nagy, C Oravecz… - Proceedings of the …, 2006 - aclanthology.org
Frequency dictionaries play an important role both in psycholinguistic experiment design
and in language technology. The paper describes a new, freely available, web-based …

The Role of Parallel Corpora in Bilingual Lexicography.

E Héja - LREC, 2010 - lexitron.nectec.or.th
This paper describes an approach based on word alignment on parallel corpora, which aims
at facilitating the lexicographic work of dictionary building. Although this method has been …

Using embedding models for lexical categorization in morphologically rich languages

B Siklósi - … Linguistics and Intelligent Text Processing: 17th …, 2018 - Springer
Neural-network-based semantic embedding models are relatively new but popular tools in
the field of natural language processing. It has been shown that continuous embedding …

Automatic structuring and correction suggestion system for Hungarian clinical records

B Siklósi, G Orosz, A Novák, G Prószéky - 2012 - real.mtak.hu
The first steps of processing clinical documents are structuring and normalization. In this
paper we demonstrate how we compensate the lack of any structure in the raw data by …

Speech technologies for Serbian and kindred South Slavic languages

V Delić, M Sečujski, N Jakovljević… - Advances in Speech …, 2010 - books.google.com
This chapter will present the results of the research and development of speech
technologies for Serbian and other kindred South Slavic languages used in five countries of …

Using a morphological analyzer in high precision POS tagging of Hungarian

P Halácsy, A Kornai, C Oravecz, T Vikto, D Varga - 2006 - eprints.sztaki.hu
The paper presents an evaluation of maxent POS disambiguation systems that incorporate
an open source morphological analyzer to constrain the probabilistic models. The …