A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Pre-trained models for natural language processing: A survey

X Qiu, T Sun, Y Xu, Y Shao, N Dai, X Huang - Science China …, 2020 - Springer
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …

Active prompting with chain-of-thought for large language models

S Diao, P Wang, Y Lin, R Pan, X Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
The increasing scale of large language models (LLMs) brings emergent abilities to various
complex tasks requiring reasoning, such as arithmetic and commonsense reasoning. It is …

Generating radiology reports via memory-driven transformer

Z Chen, Y Song, TH Chang, X Wan - arXiv preprint arXiv:2010.16056, 2020 - arxiv.org
Medical imaging is frequently used in clinical practice and trials for diagnosis and treatment.
Writing imaging reports is time-consuming and can be error-prone for inexperienced …

N-gram in Swin Transformers for efficient lightweight image super-resolution

H Choi, J Lee, J Yang - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
While some studies have proven that Swin Transformer (Swin) with window self-attention
(WSA) is suitable for single image super-resolution (SR), the plain WSA ignores the broad …

CPT: A pre-trained unbalanced transformer for both Chinese language understanding and generation

Y Shao, Z Geng, Y Liu, J Dai, H Yan, F Yang… - Science China …, 2024 - Springer
In this paper, we take advantage of previous pre-trained models (PTMs) and propose a
novel Chinese pre-trained unbalanced transformer (CPT). Different from previous Chinese …

Sparse invariant risk minimization

X Zhou, Y Lin, W Zhang… - … Conference on Machine …, 2022 - proceedings.mlr.press
Invariant Risk Minimization (IRM) is an emerging invariant feature extraction
technique for improving generalization under distribution shift. However, we find that there exists a …

Automatic prompt augmentation and selection with chain-of-thought from labeled data

KS Shum, S Diao, T Zhang - arXiv preprint arXiv:2302.12822, 2023 - arxiv.org
Chain-of-thought prompting (CoT) advances the reasoning abilities of large language
models (LLMs) and achieves superior performance in arithmetic, commonsense, and …

CBLUE: A Chinese biomedical language understanding evaluation benchmark

N Zhang, M Chen, Z Bi, X Liang, L Li, X Shang… - arXiv preprint arXiv …, 2021 - arxiv.org
Artificial Intelligence (AI), along with the recent progress in biomedical language
understanding, is gradually changing medical practice. With the development of biomedical …

Lexicon enhanced Chinese sequence labeling using BERT adapter

W Liu, X Fu, Y Zhang, W Xiao - arXiv preprint arXiv:2105.07148, 2021 - arxiv.org
Lexicon information and pre-trained models, such as BERT, have been combined to explore
Chinese sequence labeling tasks due to their respective strengths. However, existing …