A comprehensive overview and comparative analysis on deep learning models: CNN, RNN, LSTM, GRU

FM Shiri, T Perumal, N Mustapha… - arxiv preprint arxiv …, 2023 - arxiv.org
Deep learning (DL) has emerged as a powerful subset of machine learning (ML) and
artificial intelligence (AI), outperforming traditional ML methods, especially in handling …

On efficient training of large-scale deep learning models: A literature review

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - arxiv preprint arxiv …, 2023 - arxiv.org
The field of deep learning has witnessed significant progress, particularly in computer vision
(CV), natural language processing (NLP), and speech. The use of large-scale models …

Dinov2: Learning robust visual features without supervision

M Oquab, T Darcet, T Moutakanni, H Vo… - arxiv preprint arxiv …, 2023 - arxiv.org
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …

Sigmoid loss for language image pre-training

X Zhai, B Mustafa, A Kolesnikov… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a simple pairwise sigmoid loss for image-text pre-training. Unlike standard
contrastive learning with softmax normalization, the sigmoid loss operates solely on image …

Mathematical discoveries from program search with large language models

B Romera-Paredes, M Barekatain, A Novikov, M Balog… - Nature, 2024 - nature.com
Large language models (LLMs) have demonstrated tremendous capabilities in solving
complex tasks, from quantitative reasoning to understanding natural language. However …

Sheared llama: Accelerating language model pre-training via structured pruning

M **a, T Gao, Z Zeng, D Chen - arxiv preprint arxiv:2310.06694, 2023 - arxiv.org
The popularity of LLaMA (Touvron et al., 2023a; b) and other recently emerged moderate-
sized large language models (LLMs) highlights the potential of building smaller yet powerful …

Eva-02: A visual representation for neon genesis

Y Fang, Q Sun, X Wang, T Huang, X Wang… - Image and Vision …, 2024 - Elsevier
We launch EVA-02, a next-generation Transformer-based visual representation pre-trained
to reconstruct strong and robust language-aligned vision features via masked image …

Lvlm-ehub: A comprehensive evaluation benchmark for large vision-language models

P Xu, W Shao, K Zhang, P Gao, S Liu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Large Vision-Language Models (LVLMs) have recently played a dominant role in
multimodal vision-language learning. Despite the great success, it lacks a holistic evaluation …

Automated model building and protein identification in cryo-EM maps

K Jamali, L Käll, R Zhang, A Brown, D Kimanius… - Nature, 2024 - nature.com
Interpreting electron cryo-microscopy (cryo-EM) maps with atomic models requires high
levels of expertise and labour-intensive manual intervention in three-dimensional computer …

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arxiv preprint arxiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …