Developing an open‐source, rule‐based proofreading tool
In this paper, we show how an open‐source, language‐independent proofreading tool has
been built. Many languages lack contextual proofreading tools; for many others, only partial …
National Corpus of Polish
The paper presents the main results of the National Corpus of Polish project, which took
place from December 2007 to June 2011, including: the sizes of the main corpus and …
Towards a bank of constituent parse trees for Polish
We present a project aimed at construction of a bank of constituent parse trees for 20,000
Polish sentences taken from the balanced hand-annotated subcorpus of the National …
Fextor: A feature extraction framework for natural language processing: A case study in word sense disambiguation, relation recognition and anaphora resolution
Feature extraction from text corpora is an important step in Natural Language Processing
(NLP), especially for Machine Learning (ML) techniques. Various NLP tasks have many …
Named entity recognition in machine anonymization
The paper presents a formalism for the rule-based Named Entity Recognition (NER). In
comparison to existing solutions the new features of the formalism are: applicability for …
Tools and methodologies for annotating syntax and named entities in the National Corpus of Polish
The on-going project aiming at the creation of the National Corpus of Polish assumes
several levels of linguistic annotation. We present the technical environment and …
Annotation tools for syntax and named entities in the National Corpus of Polish
The ongoing National Corpus of Polish project assumes several levels of linguistic
annotation. We present the technical environment and methodological background …
Morfoskładnia w Słowniku właściwych użyć języka.
This article outlines the theoretical basis of describing morphosyntactic problems in The
Dictionary of Proper Uses of Language. The main point was to 1) show the principles of the …
Boosting question answering by deep entity recognition
In this paper an open-domain factoid question answering system for Polish, RAFAEL, is
presented. The system goes beyond finding an answering sentence; it also extracts a single …
Nested term recognition driven by word connection strength
Domain corpora are often not very voluminous and even important terms can occur in them
not as isolated maximal phrases but only within more complex constructions. Appropriate …