[PDF] Named Entity Recognition, Concept Normalization and Clinical Coding: Overview of the Cantemist Track for Cancer Text Mining in Spanish, Corpus, Guidelines …

A Miranda-Escalada, E Farré, M Krallinger - IberLEF@SEPLN, 2020 - researchgate.net
Cancer still represents one of the leading causes of death worldwide, resulting in a
considerable healthcare impact. Recent research efforts from the clinical and molecular …

Computational tools for prioritizing candidate genes: boosting disease gene discovery

Y Moreau, LC Tranchevent - Nature Reviews Genetics, 2012 - nature.com
At different stages of any research project, molecular biologists need to choose—often
somewhat arbitrarily, even after careful statistical data analysis—which genes or proteins to …

Domain-specific language model pretraining for biomedical natural language processing

Y Gu, R Tinn, H Cheng, M Lucas, N Usuyama… - ACM Transactions on …, 2021 - dl.acm.org
Pretraining large neural language models, such as BERT, has led to impressive gains on
many natural language processing (NLP) tasks. However, most pretraining efforts focus on …

The BioGRID interaction database: 2013 update

A Chatr-Aryamontri, BJ Breitkreutz… - Nucleic acids …, 2012 - academic.oup.com
The Biological General Repository for Interaction Datasets (BioGRID:
http://thebiogrid.org) is an open access archive of genetic and protein interactions that are …

Community challenges in biomedical text mining over 10 years: success, failure and the future

CC Huang, Z Lu - Briefings in bioinformatics, 2016 - academic.oup.com
One effective way to improve the state of the art is through competitions. Following the
success of the Critical Assessment of protein Structure Prediction (CASP) in bioinformatics …

[HTML] Biomedical text mining and its applications in cancer research

F Zhu, P Patumcharoenpol, C Zhang, Y Yang… - Journal of biomedical …, 2013 - Elsevier
Cancer is a malignant disease that has caused millions of human deaths. Its study has a
long history of well over 100 years. There have been an enormous number of publications on …

Memorization vs. generalization: Quantifying data leakage in NLP performance evaluation

A Elangovan, J He, K Verspoor - arXiv preprint arXiv:2102.01818, 2021 - arxiv.org
Public datasets are often used to evaluate the efficacy and generalizability of state-of-the-art
methods for many tasks in natural language processing (NLP). However, the presence of …

KnowLife: a versatile approach for constructing a large knowledge graph for biomedical sciences

P Ernst, A Siu, G Weikum - BMC bioinformatics, 2015 - Springer
Background: Biomedical knowledge bases (KBs) have become important assets in life
sciences. Prior work on KB construction has three major limitations. First, most biomedical …

Micropublications: a semantic model for claims, evidence, arguments and annotations in biomedical communications

T Clark, PN Ciccarese, CA Goble - Journal of biomedical semantics, 2014 - Springer
Background: Scientific publications are documentary representations of defeasible
arguments, supported by data and repeatable methods. They are the essential mediating …

[HTML] Text mining for the biocuration workflow

L Hirschman, GAP Burns, M Krallinger, C Arighi… - Database, 2012 - academic.oup.com
Molecular biology has become heavily dependent on biological knowledge encoded in
expert curated biological databases. As the volume of biological literature increases …