A survey on clinical natural language processing in the United Kingdom from 2007 to 2022

H Wu, M Wang, J Wu, F Francis, YH Chang… - NPJ digital …, 2022 - nature.com
Much of the knowledge and information needed for enabling high-quality clinical research is
stored in free-text format. Natural language processing (NLP) has been used to extract …

An overview of biomedical entity linking throughout the years

E French, BT McInnes - Journal of biomedical informatics, 2023 - Elsevier
Biomedical Entity Linking (BEL) is the task of mapping spans of text within
biomedical documents to normalized, unique identifiers within an ontology. This is an …

Self-alignment pretraining for biomedical entity representations

F Liu, E Shareghi, Z Meng, M Basaldella… - arXiv preprint arXiv …, 2020 - arxiv.org
Despite the widespread success of self-supervised learning via masked language models
(MLM), accurately capturing fine-grained semantic relationships in the biomedical domain …

BioBART: Pretraining and evaluation of a biomedical generative language model

H Yuan, Z Yuan, R Gan, J Zhang, Y Xie… - arXiv preprint arXiv …, 2022 - arxiv.org
Pretrained language models have served as important backbones for natural language
processing. Recently, in-domain pretraining has been shown to benefit various domain …

Benchmarking intersectional biases in NLP

JP Lalor, Y Yang, K Smith, N Forsgren… - Proceedings of the …, 2022 - aclanthology.org
There has been a recent wave of work assessing the fairness of machine learning models in
general and, more specifically, of natural language processing (NLP) models built using …

Fast, effective, and self-supervised: Transforming masked language models into universal lexical and sentence encoders

F Liu, I Vulić, A Korhonen, N Collier - arXiv preprint arXiv:2104.08027, 2021 - arxiv.org
Pretrained Masked Language Models (MLMs) have revolutionised NLP in recent years.
However, previous work has indicated that off-the-shelf MLMs are not effective as universal …

CODER: Knowledge-infused cross-lingual medical term embedding for term normalization

Z Yuan, Z Zhao, H Sun, J Li, F Wang, S Yu - Journal of biomedical …, 2022 - Elsevier
Objective: This paper aims to propose knowledge-aware embedding, a critical tool for
medical term normalization. Methods: We develop CODER (Cross-lingual knowledge …

COMETA: A corpus for medical entity linking in the social media

M Basaldella, F Liu, E Shareghi, N Collier - arXiv preprint arXiv …, 2020 - arxiv.org
Whilst there has been growing progress in Entity Linking (EL) for general language, existing
datasets fail to address the complex nature of health terminology in layman's language …

Building and using personal knowledge graph to improve suicidal ideation detection on social media

L Cao, H Zhang, L Feng - IEEE Transactions on Multimedia, 2020 - ieeexplore.ieee.org
A large number of individuals worldwide suffer from suicidal ideation, and there are many
possible causes behind why an individual might develop it. As the most …

Data evaluation and enhancement for quality improvement of machine learning

H Chen, J Chen, J Ding - IEEE Transactions on Reliability, 2021 - ieeexplore.ieee.org
Poor data quality has a direct impact on the performance of a machine learning system
built on that data. As a demonstrated effective approach for data quality improvement …