Pre-trained language models in biomedical domain: A systematic survey

B Wang, Q Xie, J Pei, Z Chen, P Tiwari, Z Li… - ACM Computing …, 2023 - dl.acm.org
Pre-trained language models (PLMs) have been the de facto paradigm for most natural
language processing tasks. This also benefits the biomedical domain: researchers from …

BioRED: a rich biomedical relation extraction dataset

L Luo, PT Lai, CH Wei, CN Arighi… - Briefings in …, 2022 - academic.oup.com
Automated relation extraction (RE) from biomedical literature is critical for many downstream
text mining applications in both research and real-world settings. However, most existing …

Galactica: A large language model for science

R Taylor, M Kardas, G Cucurull, T Scialom… - arXiv preprint arXiv …, 2022 - galactica.org
Abstract: Information overload is a major obstacle to scientific progress. The explosive growth
in scientific literature and data has made it ever harder to discover useful insights in a large …

A knowledge graph to interpret clinical proteomics data

A Santos, AR Colaço, AB Nielsen, L Niu… - Nature …, 2022 - nature.com
Implementing precision medicine hinges on the integration of omics data, such as
proteomics, into the clinical decision-making process, but the quantity and diversity of …

BioBERT: a pre-trained biomedical language representation model for biomedical text mining

J Lee, W Yoon, S Kim, D Kim, S Kim, CH So… - …, 2020 - academic.oup.com
Motivation: Biomedical text mining is becoming increasingly important as the number of
biomedical documents rapidly grows. With the progress in natural language processing …

miRBase: from microRNA sequences to function

A Kozomara, M Birgaoanu… - Nucleic acids …, 2019 - academic.oup.com
Abstract: miRBase catalogs, names and distributes microRNA gene sequences. The latest
release of miRBase (v22) contains microRNA sequences from 271 organisms: 38 589 …

A comprehensive evaluation of large language models on benchmark biomedical text processing tasks

I Jahan, MTR Laskar, C Peng, JX Huang - Computers in biology and …, 2024 - Elsevier
Abstract: Recently, Large Language Models (LLMs) have demonstrated impressive
capability to solve a wide range of tasks. However, despite their success across various …

PubTator 3.0: an AI-powered literature resource for unlocking biomedical knowledge

CH Wei, A Allot, PT Lai, R Leaman, S Tian… - Nucleic Acids …, 2024 - academic.oup.com
Abstract: PubTator 3.0 (https://www.ncbi.nlm.nih.gov/research/pubtator3/) is a biomedical
literature resource using state-of-the-art AI techniques to offer semantic and relation …

SciFive: a text-to-text transformer model for biomedical literature

LN Phan, JT Anibal, H Tran, S Chanana… - arXiv preprint arXiv …, 2021 - arxiv.org
In this report, we introduce SciFive, a domain-specific T5 model that has been pre-trained on
large biomedical corpora. Our model outperforms the current SOTA methods (i.e. BERT …

Deep learning with word embeddings improves biomedical named entity recognition

M Habibi, L Weber, M Neves, DL Wiegandt… - …, 2017 - academic.oup.com
Motivation: Text mining has become an important tool for biomedical research. The most
fundamental text-mining task is the recognition of biomedical named entities (NER), such as …