Opportunities and challenges for ChatGPT and large language models in biomedicine and health
ChatGPT has drawn considerable attention from both the general public and domain experts
with its remarkable text generation capabilities. This has subsequently led to the emergence …
A survey of knowledge enhanced pre-trained language models
Pre-trained Language Models (PLMs) which are trained on large text corpus via self-
supervised learning method, have yielded promising performance on various tasks in …
Galactica: A large language model for science
R Taylor, M Kardas, G Cucurull, T Scialom… - arXiv …
downstream tasks. However, existing methods such as BERT model a single document, and …
Domain-specific language model pretraining for biomedical natural language processing
Pretraining large neural language models, such as BERT, has led to impressive gains on
many natural language processing (NLP) tasks. However, most pretraining efforts focus on …
SciBERT: A pretrained language model for scientific text
Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging
and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin …
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
Motivation: Biomedical text mining is becoming increasingly important as the number of
biomedical documents rapidly grows. With the progress in natural language processing …
S2ORC: The semantic scholar open research corpus
We introduce S2ORC, a large corpus of 81.1 M English-language academic papers
spanning many academic disciplines. The corpus consists of rich metadata, paper abstracts …
A survey on recent advances in named entity recognition from deep learning models
Named Entity Recognition (NER) is a key component in NLP systems for question
answering, information retrieval, relation extraction, etc. NER systems have been studied …
ScispaCy: fast and robust models for biomedical natural language processing
Despite recent advances in natural language processing, many statistical models for
processing text perform extremely poorly under domain shift. Processing biomedical and …