A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Few-shot learning for medical text: A review of advances, trends, and opportunities

Y Ge, Y Guo, S Das, MA Al-Garadi, A Sarker - Journal of Biomedical …, 2023 - Elsevier
Background: Few-shot learning (FSL) is a class of machine learning methods that require
small numbers of labeled instances for training. With many medical topics having limited …

P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks

X Liu, K Ji, Y Fu, WL Tam, Z Du, Z Yang… - arXiv preprint arXiv …, 2021 - arxiv.org
Prompt tuning, which only tunes continuous prompts with a frozen language model,
substantially reduces per-task storage and memory usage at training. However, in the …
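
As a rough illustration of the mechanism this snippet describes (continuous prompts tuned while the language model itself stays frozen), here is a minimal sketch assuming a Hugging Face-style backbone that exposes config.hidden_size and accepts inputs_embeds; the class name, prompt length, and initialization are illustrative assumptions, not the P-tuning v2 implementation:

```python
import torch
import torch.nn as nn

class SoftPromptWrapper(nn.Module):
    """Minimal prompt-tuning sketch: the backbone is frozen, only prompt vectors train."""
    def __init__(self, backbone, num_prompt_tokens=20):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False                      # freeze every pre-trained weight
        hidden = self.backbone.config.hidden_size
        # the only trainable parameters: a small matrix of continuous prompt embeddings
        self.prompt = nn.Parameter(torch.randn(num_prompt_tokens, hidden) * 0.02)

    def forward(self, inputs_embeds, attention_mask):
        batch = inputs_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        extended = torch.cat([prompt, inputs_embeds], dim=1)   # prepend prompt vectors
        prompt_mask = torch.ones(batch, self.prompt.size(0),
                                 dtype=attention_mask.dtype,
                                 device=attention_mask.device)
        mask = torch.cat([prompt_mask, attention_mask], dim=1)
        return self.backbone(inputs_embeds=extended, attention_mask=mask)
```

Because only self.prompt carries gradients, the per-task artifact to store is just this small prompt tensor, which is the storage and memory saving the snippet refers to.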

Unified named entity recognition as word-word relation classification

J Li, H Fei, J Liu, S Wu, M Zhang, C Teng… - Proceedings of the AAAI …, 2022 - ojs.aaai.org
So far, named entity recognition (NER) has involved three major types: flat, overlapped (aka nested), and discontinuous NER, which have mostly been studied …
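
The title's formulation can be pictured with a small sketch: score a relation label for every ordered word pair, so that flat, nested, and discontinuous entities are all decoded from the same word-word grid. This is an assumed illustration (hidden size, label set, and projection layers are placeholders), not the paper's architecture:

```python
import torch
import torch.nn as nn

class WordPairRelationClassifier(nn.Module):
    """Minimal sketch of NER as word-word relation classification:
    every ordered pair of words receives a relation label (e.g. a NONE label plus
    labels linking words inside the same entity)."""
    def __init__(self, hidden_size=256, num_relations=3):
        super().__init__()
        self.head_proj = nn.Linear(hidden_size, hidden_size)
        self.tail_proj = nn.Linear(hidden_size, hidden_size)
        self.scorer = nn.Linear(2 * hidden_size, num_relations)

    def forward(self, word_reprs):                  # (batch, seq_len, hidden)
        h = self.head_proj(word_reprs)              # first word of each pair
        t = self.tail_proj(word_reprs)              # second word of each pair
        seq = word_reprs.size(1)
        # build all (i, j) pairs and score a relation for each
        grid = torch.cat([
            h.unsqueeze(2).expand(-1, -1, seq, -1),
            t.unsqueeze(1).expand(-1, seq, -1, -1),
        ], dim=-1)
        return self.scorer(grid)                    # (batch, seq_len, seq_len, num_relations)
```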

PTR: Prompt tuning with rules for text classification

X Han, W Zhao, N Ding, Z Liu, M Sun - AI Open, 2022 - Elsevier
Recently, prompt tuning has been widely applied to stimulate the rich knowledge in pre-trained language models (PLMs) to serve NLP tasks. Although prompt tuning has achieved …

ERNIE 3.0: Large-scale knowledge enhanced pre-training for language understanding and generation

Y Sun, S Wang, S Feng, S Ding, C Pang… - arXiv preprint arXiv …, 2021 - arxiv.org
Pre-trained models have achieved state-of-the-art results in various Natural Language
Processing (NLP) tasks. Recent works such as T5 and GPT-3 have shown that scaling up …

Multi-task learning with deep neural networks: A survey

M Crawshaw - arXiv preprint arXiv:2009.09796, 2020 - arxiv.org
Multi-task learning (MTL) is a subfield of machine learning in which multiple tasks are
simultaneously learned by a shared model. Such approaches offer advantages like …
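
As a toy illustration of the hard parameter sharing that a shared model implies (one trunk learned jointly, with a separate head per task), the layer sizes and task count below are arbitrary assumptions, not drawn from the survey:

```python
import torch.nn as nn

class HardSharingMTL(nn.Module):
    """Minimal multi-task sketch: one shared trunk, one output head per task."""
    def __init__(self, in_dim=128, shared_dim=64, task_out_dims=(2, 5)):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, shared_dim), nn.ReLU())
        self.heads = nn.ModuleList(nn.Linear(shared_dim, d) for d in task_out_dims)

    def forward(self, x):
        z = self.shared(x)                          # representation shared by all tasks
        return [head(z) for head in self.heads]     # one prediction per task
```

A training loop would typically combine the per-task losses on these outputs into a single objective, which is where the shared representation is learned.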

Large language model is not a good few-shot information extractor, but a good reranker for hard samples!

Y Ma, Y Cao, YC Hong, A Sun - arXiv preprint arXiv:2303.08559, 2023 - arxiv.org
Large Language Models (LLMs) have made remarkable strides in various tasks. Whether
LLMs are competitive few-shot solvers for information extraction (IE) tasks, however, remains …

BERT rediscovers the classical NLP pipeline

I Tenney - arXiv preprint arXiv:1905.05950, 2019 - fq.pkwyx.com
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We
focus on one such model, BERT, and aim to quantify where linguistic information is captured …

FLAT: Chinese NER using flat-lattice transformer

X Li, H Yan, X Qiu, X Huang - arXiv preprint arXiv:2004.11795, 2020 - arxiv.org
Recently, the character-word lattice structure has been shown to be effective for Chinese named entity recognition (NER) by incorporating word information. However, since the …