- Academic Search

A brief overview of ChatGPT: The history, status quo and potential future development

T Wu, S He, J Liu, S Sun, K Liu… - IEEE/CAA Journal of …, 2023 - ieeexplore.ieee.org

ChatGPT, an artificial intelligence generated content (AIGC) model developed by OpenAI,
has attracted world-wide attention for its capability of dealing with challenging language …

Cited by 1161

A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer

Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

Cited by 611

LLM-Pruner: On the structural pruning of large language models

X Ma, G Fang, X Wang - Advances in neural information …, 2023 - proceedings.neurips.cc

Large language models (LLMs) have shown remarkable capabilities in language
understanding and generation. However, such impressive capability typically comes with a …

Cited by 494

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …

Cited by 2678

Knowledge distillation: A survey

J Gou, B Yu, SJ Maybank, D Tao - International Journal of Computer Vision, 2021 - Springer

In recent years, deep neural networks have been successful in both industry and academia,
especially for computer vision tasks. The great success of deep learning is mainly due to its …

Cited by 3317

ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers

Z Yao, R Yazdani Aminabadi… - Advances in …, 2022 - proceedings.neurips.cc

How to efficiently serve ever-larger trained natural language models in practice has become
exceptionally challenging even for powerful cloud servers due to their prohibitive …

Cited by 390

Pre-trained models: Past, present and future

X Han, Z Zhang, N Ding, Y Gu, X Liu, Y Huo, J Qiu… - AI Open, 2021 - Elsevier

Large-scale pre-trained models (PTMs) such as BERT and GPT have recently achieved
great success and become a milestone in the field of artificial intelligence (AI). Owing to …

Cited by 931

Pre-trained models for natural language processing: A survey

X Qiu, T Sun, Y Xu, Y Shao, N Dai, X Huang - Science China …, 2020 - Springer

Recently, the emergence of pre-trained models (PTMs) has brought natural language
processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs …

Cited by 1930

A primer in BERTology: What we know about how BERT works

A Rogers, O Kovaleva, A Rumshisky - Transactions of the Association …, 2021 - direct.mit.edu

Transformer-based models have pushed state of the art in many areas of NLP, but our
understanding of what is behind their success is still limited. This paper is the first survey of …

Cited by 1826

TinyBERT: Distilling BERT for natural language understanding

X Jiao, Y Yin, L Shang, X Jiang, X Chen, L Li… - arXiv preprint arXiv …, 2019 - arxiv.org

Language model pre-training, such as BERT, has significantly improved the performances of
many natural language processing tasks. However, pre-trained language models are …

Cited by 2033

Patient knowledge distillation for BERT model compression

A brief overview of ChatGPT: The history, status quo and potential future development