A comprehensive survey on pretrained foundation models: A history from BERT to ChatGPT

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (e.g., BERT, ChatGPT, GPT-4) is …

A state-of-the-art survey on deep learning theory and architectures

MZ Alom, TM Taha, C Yakopcic, S Westberg, P Sidike… - Electronics, 2019 - mdpi.com
In recent years, deep learning has garnered tremendous success in a variety of application
domains. This new field of machine learning has been growing rapidly and has been …

Graph contrastive learning with augmentations

Y You, T Chen, Y Sui, T Chen… - Advances in neural …, 2020 - proceedings.neurips.cc
Generalizable, transferrable, and robust representation learning on graph-structured data
remains a challenge for current graph neural networks (GNNs). Unlike what has been …

R-Drop: Regularized dropout for neural networks

L Wu, J Li, Y Wang, Q Meng, T Qin… - Advances in …, 2021 - proceedings.neurips.cc
Dropout is a powerful and widely used technique to regularize the training of deep neural
networks. Though effective and performing well, the randomness introduced by dropout …

Deep learning

I Goodfellow - 2016 - synapse.koreamed.org
An introduction to a broad range of topics in deep learning, covering mathematical and
conceptual background, deep learning techniques used in industry, and research …

Convolutional neural networks for medical image analysis: Full training or fine tuning?

N Tajbakhsh, JY Shin, SR Gurudu… - IEEE transactions on …, 2016 - ieeexplore.ieee.org
Training a deep convolutional neural network (CNN) from scratch is difficult because it
requires a large amount of labeled training data and a great deal of expertise to ensure …

FitNets: Hints for thin deep nets

A Romero, N Ballas, SE Kahou, A Chassang… - arXiv preprint arXiv …, 2014 - arxiv.org
While depth tends to improve network performances, it also makes gradient-based training
more difficult since deeper networks tend to be more non-linear. The recently proposed …

The history began from AlexNet: A comprehensive survey on deep learning approaches

MZ Alom, TM Taha, C Yakopcic, S Westberg… - arXiv preprint arXiv …, 2018 - arxiv.org
Deep learning has demonstrated tremendous success in a variety of application domains in
the past few years. This new field of machine learning has been growing rapidly and applied …

Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

Understanding the difficulty of training deep feedforward neural networks

X Glorot, Y Bengio - Proceedings of the thirteenth …, 2010 - proceedings.mlr.press
Whereas before 2006 it appears that deep multi-layer neural networks were not successfully
trained, since then several algorithms have been shown to successfully train them, with …