Rolling out text categorization for language learning assessment supported by language technology
This paper is concerned with a tool that supports human experts in their task of classifying
text excerpts suitable to be used in quizzes for learning materials and as items of exams that …
Coping with highly imbalanced datasets: A case study with definition extraction in a multilingual setting
This paper addresses the task of automatic extraction of definitions by thoroughly exploring
an approach that solely relies on machine learning techniques, and by focusing on the issue …
Assessing automatic text classification for interactive language learning
A Branco, J Rodrigues, F Costa… - … on Information Society …, 2014 - ieeexplore.ieee.org
In this paper we discuss the design options for a language processing tool that supports
humans in their task of classifying text excerpts according to CEFR levels of language …
Universal grammatical dependencies for Portuguese with CINTIL data, LX processing and CLARIN support
The grammatical framework for the mapping between linguistic form and meaning
representation known as Universal Dependencies relies on a non-constituency syntactic …
Neural text categorization with transformers for learning portuguese as a second language
We report on the application of a neural network based approach to the problem of
automatically categorizing texts according to their proficiency levels and suitability for …
LemPORT: a high-accuracy cross-platform lemmatizer for portuguese
Although lemmatization is a very common subtask in many natural language processing
tasks, there is a lack of available true cross-platform lemmatization tools specifically targeted …
The BDCamões collection of Portuguese literary documents: a research resource for digital humanities and language technology
S Grilo, M Bolrinha, J Silva, R Vaz… - Proceedings of the …, 2020 - aclanthology.org
This paper presents the BDCamões Collection of Portuguese Literary Documents, a new
corpus of literary texts written in Portuguese that in its inaugural version includes close to 4 …
Combining a double clustering approach with sentence simplification to produce highly informative multi-document summaries
SB Silveira, A Branco - … on Information Reuse & Integration (IRI), 2012 - ieeexplore.ieee.org
This paper presents a method for extractive multi-document summarization that explores a
two-phase clustering approach that, combined with a sentence simplification procedure …
Historical Portuguese corpora: a survey
TF Osório, H Lopes Cardoso - Language Resources and Evaluation, 2024 - Springer
This survey aims to thoroughly examine and evaluate the current landscape of electronic
corpora in historical Portuguese. This is achieved through a comprehensive analysis of …
From greatest simplicity to full power: Research-Infrastructure-as-a-Service for language science and technology
While language processing services are key assets for the science and technology of
language, the possible ways under which they may be made available to the widest range of …