Developing an open‐source, rule‐based proofreading tool

M Miłkowski - Software: Practice and Experience, 2010 - Wiley Online Library
In this paper, we show how an open‐source, language‐independent proofreading tool has
been built. Many languages lack contextual proofreading tools; for many others, only partial …

National corpus of Polish

A Przepiórkowski, M Bańko, RL Górski… - Proceedings of the 5th …, 2011 - academia.edu
The paper presents the main results of the National Corpus of Polish project, which took
place from December 2007 to June 2011, including: the sizes of the main corpus and …

Towards a bank of constituent parse trees for Polish

M Świdziński, M Woliński - International Conference on Text, Speech and …, 2010 - Springer
We present a project aimed at construction of a bank of constituent parse trees for 20,000
Polish sentences taken from the balanced hand-annotated subcorpus of the National …

Fextor: A feature extraction framework for natural language processing: A case study in word sense disambiguation, relation recognition and anaphora resolution

B Broda, P Kędzia, M Marcińczuk… - Computational …, 2013 - Springer
Feature extraction from text corpora is an important step in Natural Language Processing
(NLP), especially for Machine Learning (ML) techniques. Various NLP tasks have many …

Named entity recognition in machine anonymization

F Graliński, K Jassem, M Marcińczuk… - Recent Advances in …, 2009 - aclanthology.org
The paper presents a formalism for the rule-based Named Entity Recognition (NER). In
comparison to existing solutions the new features of the formalism are: applicability for …

Tools and methodologies for annotating syntax and named entities in the National Corpus of Polish

J Waszczuk, K Głowińska, A Savary… - Proceedings of the …, 2010 - ieeexplore.ieee.org
The on-going project aiming at the creation of the National Corpus of Polish assumes
several levels of linguistic annotation. We present the technical environment and …

Annotation tools for syntax and named entities in the National Corpus of Polish

J Waszczuk, K Głowińska, A Savary… - … Journal of Data …, 2013 - inderscienceonline.com
The ongoing National Corpus of Polish project assumes several levels of linguistic
annotation. We present the technical environment and methodological background …

Morfoskładnia w Słowniku właściwych użyć języka.

M Gębka-Wolak, A Moroz - Język Polski, 2021 - search.ebscohost.com
This article outlines the theoretical basis of describing morphosyntactic problems in The
Dictionary of Proper Uses of Language. The main point was to 1) show the principles of the …

Boosting question answering by deep entity recognition

P Przybyła - arXiv preprint arXiv:1605.08675, 2016 - arxiv.org
In this paper an open-domain factoid question answering system for Polish, RAFAEL, is
presented. The system goes beyond finding an answering sentence; it also extracts a single …

Nested term recognition driven by word connection strength

M Marciniak, A Mykowiecka - … of Theoretical and Applied Issues in …, 2015 - jbe-platform.com
Domain corpora are often not very voluminous and even important terms can occur in them
not as isolated maximal phrases but only within more complex constructions. Appropriate …