A method for named entity normalization in biomedical articles: application to diseases and plants
Background In biomedical articles, a named entity recognition (NER) technique that
identifies entity names from texts is an important element for extracting biological knowledge …
Ethical data curation for AI: An approach based on feminist epistemology and critical theories of race
The potential for bias embedded in data to lead to the perpetuation of social injustice through
Artificial Intelligence (AI) necessitates an urgent reform of data curation practices for AI …
Through the limits of newspeak: an analysis of the vector representation of words in George Orwell's 1984
I Dunđer, M Pavlovski - 2019 42nd International Convention on …, 2019 - ieeexplore.ieee.org
The era of fake news, media manipulation and information wars has been beneficent to the
lasting fame and continuous acclaim of George Orwell's 1984. The novel, published in 1949 …
Curatr: a platform for semantic analysis and curation of historical literary texts
The increasing availability of digital collections of historical and contemporary literature
presents a wealth of possibilities for new research in the humanities. The scale and diversity …
Relation clustering in narrative knowledge graphs
S Mellace, K Vani, A Antonucci - …
… with literary texts such as novels or short stories, the extraction of structured
information in the form of a knowledge graph might be hindered by the huge number of …
Creation and evaluation of datasets for distributional semantics tasks in the digital humanities domain
Word embeddings are already well studied in the general domain, usually trained on large
text corpora, and have been evaluated for example on word similarity and analogy tasks, but …
Relation extraction datasets in the digital humanities domain and their evaluation with word embeddings
In this research, we manually create high-quality datasets in the digital humanities domain
for the evaluation of language models, specifically word embedding models. The first step …
Razmecheno: Named Entity Recognition from Digital Archive of Diaries "Prozhito"
T Atnashev, V Ganeeva, R Kazakov, D Matyash… - arXiv preprint arXiv …, 2022 - arxiv.org
The vast majority of existing datasets for Named Entity Recognition (NER) are built primarily
on news, research papers and Wikipedia with a few exceptions, created from historical and …
Navigating literary text with word embeddings and semantic lexicons
Word embeddings represent a powerful tool for mining the vocabularies of literary and
historical text. However, there is little research demonstrating appropriate strategies for …