A method for named entity normalization in biomedical articles: application to diseases and plants

H Cho, W Choi, H Lee - BMC bioinformatics, 2017 - Springer
Background In biomedical articles, a named entity recognition (NER) technique that
identifies entity names from texts is an important element for extracting biological knowledge …

Ethical data curation for ai: An approach based on feminist epistemology and critical theories of race

S Leavy, E Siapera, B O'Sullivan - Proceedings of the 2021 AAAI/ACM …, 2021 - dl.acm.org
The potential for bias embedded in data to lead to the perpetuation of social injustice through
Artificial Intelligence (AI) necessitates an urgent reform of data curation practices for AI …

Through the limits of newspeak: an analysis of the vector representation of words in George Orwell's 1984

I Dunđer, M Pavlovski - 2019 42nd International Convention on …, 2019 - ieeexplore.ieee.org
The era of fake news, media manipulation and information wars has been beneficent to the
lasting fame and continuous acclaim of George Orwell's 1984. The novel, published in 1949 …

Curatr: a platform for semantic analysis and curation of historical literary texts

S Leavy, G Meaney, K Wade, D Greene - Metadata and Semantic …, 2019 - Springer
The increasing availability of digital collections of historical and contemporary literature
presents a wealth of possibilities for new research in the humanities. The scale and diversity …

Relation clustering in narrative knowledge graphs

S Mellace, K Vani, A Antonucci - ar…
… with literary texts such as novels or short stories, the extraction of structured
information in the form of a knowledge graph might be hindered by the huge number of …

Creation and evaluation of datasets for distributional semantics tasks in the digital humanities domain

G Wohlgenannt, A Barinova, D Ilvovsky… - arXiv preprint arXiv …, 2019 - arxiv.org
Word embeddings are already well studied in the general domain, usually trained on large
text corpora, and have been evaluated for example on word similarity and analogy tasks, but …

Relation extraction datasets in the digital humanities domain and their evaluation with word embeddings

G Wohlgenannt, E Chernyak, D Ilvovsky… - … and Intelligent Text …, 2018 - Springer
In this research, we manually create high-quality datasets in the digital humanities domain
for the evaluation of language models, specifically word embedding models. The first step …

Razmecheno: Named Entity Recognition from Digital Archive of Diaries "Prozhito"

T Atnashev, V Ganeeva, R Kazakov, D Matyash… - arXiv preprint arXiv …, 2022 - arxiv.org
The vast majority of existing datasets for Named Entity Recognition (NER) are built primarily
on news, research papers and Wikipedia with a few exceptions, created from historical and …

Navigating literary text with word embeddings and semantic lexicons

S Leavy, K Wade, G Meaney, D Greene - 2018 - researchrepository.ucd.ie
Word embeddings represent a powerful tool for mining the vocabularies of literary and
historical text. However, there is little research demonstrating appropriate strategies for …