A comprehensive survey on evaluating large language model applications in the medical industry
Y Huang, K Tang, M Chen, B Wang - arXiv preprint arXiv:2404.15777, 2024 - arxiv.org
Since the inception of the Transformer architecture in 2017, Large Language Models (LLMs)
such as GPT and BERT have evolved significantly, impacting various industries with their …
A comprehensive evaluation of large language models on benchmark biomedical text processing tasks
Abstract Recently, Large Language Models (LLMs) have demonstrated impressive
capability to solve a wide range of tasks. However, despite their success across various …
ChatGPT vs human-authored text: Insights into controllable text summarization and sentence style transfer
D Pu, V Demberg - arXiv preprint arXiv:2306.07799, 2023 - arxiv.org
Large-scale language models, like ChatGPT, have garnered significant media attention and
stunned the public with their remarkable capacity for generating coherent text from short …
Overview of the BioLaySumm 2024 shared task on the lay summarization of biomedical research articles
This paper presents the setup and results of the second edition of the BioLaySumm shared
task on the Lay Summarisation of Biomedical Research Articles, hosted at the BioNLP …
FactKB: Generalizable factuality evaluation using language models enhanced with factual knowledge
Evaluating the factual consistency of automatically generated summaries is essential for the
progress and adoption of reliable summarization systems. Despite recent advances, existing …
Ascle—a Python natural language processing toolkit for medical text generation: development and evaluation study
Background Medical texts present significant domain-specific challenges, and manually
curating these texts is a time-consuming and labor-intensive process. To address this …
MeetingBank: A benchmark dataset for meeting summarization
As the number of recorded meetings increases, it becomes increasingly important to utilize
summarization technology to create useful summaries of these recordings. However, there is …
Retrieval augmentation of large language models for lay language generation
The complex linguistic structures and specialized terminology of expert-authored content
limit the accessibility of biomedical literature to the general public. Automated methods have …
Improving biomedical abstractive summarisation with knowledge aggregation from citation papers
Abstracts derived from biomedical literature possess distinct domain-specific characteristics,
including specialised writing styles and biomedical terminologies, which necessitate a deep …
Language model as an annotator: Unsupervised context-aware quality phrase generation
Phrase mining is a fundamental text mining task that aims to identify quality phrases from
context. Nevertheless, the scarcity of extensive gold labels datasets, demanding substantial …