- Academic Search

Named entity recognition in the Romanian legal domain

V Păiș, M Mitrofan, CL Gasan… - Proceedings of the …, 2021 - aclanthology.org

Recognition of named entities present in text is an important step towards information
extraction and natural language understanding. This work presents a named entity …

Cited by 40

Abstractive text summarization for Hungarian

ZG Yang, Á Agócs, G Kusper, T Váradi - Annales Mathematicae et …, 2021 - real.mtak.hu

In our research we have created a text summarization software tool for Hungarian using
multilingual and Hungarian BERT-based models. Two types of text summarization method …

Cited by 21

Introducing the CURLICAT corpora: seven-language domain specific annotated corpora from curated sources

T Váradi, B Nyéki, S Koeva, M Tadić… - Proceedings of the …, 2022 - aclanthology.org

This article presents the current outcomes of the CURLICAT CEF Telecom project, which
aims to collect and deeply annotate a set of large corpora from selected domains. The …

Cited by 12

UlyssesNER-Br: a corpus of Brazilian legislative documents for named entity recognition

HO Albuquerque, R Costa, G Silvestre, E Souza… - … Processing of the …, 2022 - Springer

The amount of legislative documents produced within the past decade has risen
dramatically, making it difficult for law practitioners to consult and update legislation. Named …

Cited by 18

Ulysses Tesemõ: a new large corpus for Brazilian legal and governmental domain

FA Siqueira, D Vitório, E Souza, JAP Santos… - Language Resources …, 2024 - Springer

The increasing use of artificial intelligence methods in the legal field has sparked interest in
applying Natural Language Processing techniques to handle legal tasks and reduce the …

Cited by 1

FuLG: 150B Romanian Corpus for Language Model Pretraining

VA Bădoiu, MV Dumitru, AM Gherghescu… - arXiv preprint arXiv …, 2024 - arxiv.org

Research in the field of language models is rapidly evolving, with many open models being
released to the public. Openly available pretraining corpora usually focus on only a handful …

Cited by 1

HistNERo: Historical named entity recognition for the Romanian language

AM Avram, A Iuga, GV Manolache, VC Matei… - … on Document Analysis …, 2024 - Springer

This work introduces HistNERo, the first Romanian corpus for Named Entity Recognition
(NER) in historical newspapers. The dataset contains 323k tokens of text, covering more …

Cited by 1

Annotators-in-the-loop: testing a novel annotation procedure on Italian case law

E Zanoli, M Barbini, D Riva, S Picascia… - Proceedings of the 17th …, 2023 - air.unimi.it

The availability of annotated legal corpora is crucial for a number of tasks, such as legal
search, legal information retrieval, and predictive justice. Annotation is mostly assumed to be …

Cited by 2

Automatic Extraction of the Romanian Academic Word List: Data and Methods

AM Bucur, A Dincă, M Chitez, R Rogobete - arXiv preprint arXiv …, 2023 - arxiv.org

This paper presents the methodology and data used for the automatic extraction of the
Romanian Academic Word List (Ro-AWL). Academic Word Lists are useful in both L2 and L1 …

Cited by 2

In-depth evaluation of Romanian natural language processing pipelines

V Pais, R Ion, AM Avram, M Mitrofan, D Tufis - Romanian Journal of …, 2021 - romjist.ro

With the increased size of Universal Dependencies tree banks, several basic language
processing kits (BLARK) for multiple languages appeared in recent years, indicating …

Cited by 12

The MARCELL legislative corpus

Named entity recognition in the Romanian legal domain
