A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …
Pre-trained models for natural language processing: A survey
Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …
Active prompting with chain-of-thought for large language models
The increasing scale of large language models (LLMs) brings emergent abilities to various
complex tasks requiring reasoning, such as arithmetic and commonsense reasoning. It is …
Generating radiology reports via memory-driven transformer
Medical imaging is frequently used in clinical practice and trials for diagnosis and treatment.
Writing imaging reports is time-consuming and can be error-prone for inexperienced …
N-gram in Swin Transformers for efficient lightweight image super-resolution
While some studies have proven that Swin Transformer (Swin) with window self-attention
(WSA) is suitable for single image super-resolution (SR), the plain WSA ignores the broad …
CPT: A pre-trained unbalanced transformer for both Chinese language understanding and generation
In this paper, we take advantage of previous pre-trained models (PTMs) and propose a
novel Chinese pre-trained unbalanced transformer (CPT). Different from previous Chinese …
Sparse invariant risk minimization
Invariant Risk Minimization (IRM) is an emerging technique for extracting invariant
features to aid generalization under distributional shift. However, we find that there exists a …
Automatic prompt augmentation and selection with chain-of-thought from labeled data
Chain-of-thought prompting (CoT) advances the reasoning abilities of large language
models (LLMs) and achieves superior performance in arithmetic, commonsense, and …
CBLUE: A Chinese biomedical language understanding evaluation benchmark
Artificial Intelligence (AI), along with the recent progress in biomedical language
understanding, is gradually changing medical practice. With the development of biomedical …
Lexicon enhanced Chinese sequence labeling using BERT adapter
Lexicon information and pre-trained models, such as BERT, have been combined to explore
Chinese sequence labeling tasks due to their respective strengths. However, existing …