AMMUS: A survey of transformer-based pretrained models in natural language processing
KS Kalyan, A Rajasekharan, S Sangeetha - arXiv preprint arXiv …, 2021 - arxiv.org
Transformer-based pretrained language models (T-PTLMs) have achieved great success in
almost every NLP task. The evolution of these models started with GPT and BERT. These …
A systematic review of transformer-based pre-trained language models through self-supervised learning
Transfer learning is a technique used in deep learning applications to transfer learned knowledge to a different target domain. The approach mainly aims to solve the problem of a few …
Efficient large language models: A survey
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …
On transferability of prompt tuning for natural language processing
Prompt tuning (PT) is a promising parameter-efficient method to utilize extremely large pre-
trained language models (PLMs), which can achieve comparable performance to full …
CPM-2: Large-scale cost-effective pre-trained language models
In recent years, the size of pre-trained language models (PLMs) has grown by leaps and
bounds. However, efficiency issues of these large-scale PLMs limit their utilization in real …
A survey of resource-efficient LLM and multimodal foundation models
Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion models, and LLM-based multimodal models, are revolutionizing the entire machine …
bert2BERT: Towards reusable pretrained language models
In recent years, researchers have tended to pre-train ever-larger language models to explore the upper limit of deep models. However, large language model pre-training costs intensive …
ELLE: Efficient lifelong pre-training for emerging data
Current pre-trained language models (PLMs) are typically trained with static data, ignoring that in real-world scenarios, streaming data from various sources may continuously grow. This …
Learning to grow pretrained models for efficient transformer training
Scaling transformers has led to significant breakthroughs in many domains, leading to a
paradigm in which larger versions of existing models are trained and released on a periodic …
Cross-lingual consistency of factual knowledge in multilingual language models
Multilingual large-scale Pretrained Language Models (PLMs) have been shown to store
considerable amounts of factual knowledge, but large variations are observed across …