A comprehensive overview of large language models
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …
Data augmentation: A comprehensive survey of modern approaches
A Mumuni, F Mumuni - Array, 2022 - Elsevier
To ensure good performance, modern machine learning models typically require large
amounts of quality annotated data. Meanwhile, the data collection and annotation processes …
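As a concrete illustration of the label-preserving transforms such surveys catalogue, here is a minimal NumPy sketch of two classic image augmentations, random horizontal flip and random crop; the function name and crop size are illustrative, not from the paper.

```python
import numpy as np

def augment(img, rng, crop=28):
    """Two classic label-preserving transforms: random flip + random crop."""
    if rng.random() < 0.5:
        img = img[:, ::-1]                        # flip left-right
    h, w = img.shape[:2]
    top = rng.integers(0, h - crop + 1)
    left = rng.integers(0, w - crop + 1)
    return img[top:top + crop, left:left + crop]

rng = np.random.default_rng(0)
views = [augment(np.zeros((32, 32, 3)), rng) for _ in range(4)]
print(views[0].shape)                             # (28, 28, 3)
```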
Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
The growing energy and performance costs of deep learning have driven the community to
reduce the size of neural networks by selectively pruning components. Similarly to their …
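The simplest of the pruning schemes such surveys cover is unstructured magnitude pruning: zero out the weights with the smallest absolute value. A minimal sketch (the function name and 90% sparsity level are illustrative choices, not the paper's):

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    k = int(sparsity * weights.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

w = np.random.default_rng(0).normal(size=(256, 256))
w_sparse = magnitude_prune(w, sparsity=0.9)
print(f"nonzero fraction: {np.count_nonzero(w_sparse) / w.size:.3f}")
```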
A systematic review on overfitting control in shallow and deep neural networks
Shallow neural networks process the features directly, while deep networks extract features
automatically along with the training. Both models suffer from overfitting or poor …
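One standard overfitting control within the review's scope is early stopping: halt training once validation loss stops improving. A self-contained sketch, with illustrative class name, patience, and threshold:

```python
class EarlyStopping:
    """Stop training when validation loss has not improved for `patience` epochs."""
    def __init__(self, patience=5, min_delta=1e-4):
        self.patience, self.min_delta = patience, min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best, self.bad_epochs = val_loss, 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

stopper = EarlyStopping(patience=3)
for epoch, loss in enumerate([1.0, 0.8, 0.75, 0.76, 0.77, 0.78]):
    if stopper.step(loss):
        print(f"stopping at epoch {epoch}")
        break
```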
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
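Word error rate, the metric quoted above, is a word-level edit distance; a small reference implementation, assuming the usual (substitutions + insertions + deletions) / reference-length definition:

```python
def word_error_rate(ref, hyp):
    """WER = (S + I + D) / len(ref), computed via Levenshtein distance over words."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    return d[len(r)][len(h)] / max(len(r), 1)

print(word_error_rate("the cat sat", "the bat sat down"))  # 2 edits / 3 words
```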
HiPPO: Recurrent memory with optimal polynomial projections
A central problem in learning from sequential data is representing cumulative history in an
incremental fashion as more data is processed. We introduce a general framework (HiPPO) …
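A rough numerical sketch of the HiPPO-LegS instance of that framework: maintain coefficients c(t) of the history's projection onto Legendre polynomials and update them online as each input arrives. The (A, B) entries below are quoted from the paper; the bilinear time-stepping and all sizes are illustrative, not a faithful reimplementation.

```python
import numpy as np

def hippo_legs(N):
    """HiPPO-LegS (A, B) matrices; entries quoted from the paper, not re-derived."""
    A = np.zeros((N, N))
    for n in range(N):
        for k in range(N):
            if n > k:
                A[n, k] = np.sqrt((2 * n + 1) * (2 * k + 1))
            elif n == k:
                A[n, k] = n + 1
    return A, np.sqrt(2 * np.arange(N) + 1.0)

N, T = 16, 200
A, B = hippo_legs(N)
I = np.eye(N)
c = np.zeros(N)                          # coefficients of the history's projection
f = np.sin(np.linspace(0, 6, T))         # toy input signal
for k in range(1, T + 1):
    # bilinear discretization of c'(t) = -(1/t) A c(t) + (1/t) B f(t)
    c = np.linalg.solve(I + A / (2 * k), (I - A / (2 * k)) @ c + (B / k) * f[k - 1])
print(np.round(c[:4], 3))
```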
ProphetNet: Predicting future n-gram for sequence-to-sequence pre-training
This paper presents a new sequence-to-sequence pre-training model called ProphetNet,
which introduces a novel self-supervised objective named future n-gram prediction and the …
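A simplified sketch of the future n-gram objective: at each position, predict the next n tokens rather than only the next one. ProphetNet realizes this with an n-stream self-attention decoder; this PyTorch snippet keeps only the loss structure, with illustrative names and sizes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FutureNgramHead(nn.Module):
    """Toy future n-gram loss: n linear heads, head i predicts token t+i."""
    def __init__(self, d_model, vocab, n=2):
        super().__init__()
        self.n = n
        self.heads = nn.ModuleList(nn.Linear(d_model, vocab) for _ in range(n))

    def forward(self, hidden, targets):
        # hidden: (batch, seq, d_model); targets: (batch, seq) token ids
        loss = 0.0
        for i, head in enumerate(self.heads, start=1):
            logits = head(hidden[:, :-i])        # only positions that have a t+i target
            loss = loss + F.cross_entropy(
                logits.reshape(-1, logits.size(-1)), targets[:, i:].reshape(-1))
        return loss / self.n

head = FutureNgramHead(d_model=32, vocab=100, n=2)
h = torch.randn(4, 16, 32)
y = torch.randint(0, 100, (4, 16))
print(head(h, y))
```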
DropBlock: A regularization method for convolutional networks
Deep neural networks often work well when they are over-parameterized and trained with a
massive amount of noise and regularization, such as weight decay and dropout. Although …
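DropBlock drops contiguous regions of a feature map rather than independent units, so neighboring activations cannot trivially compensate for a dropped one. A hedged PyTorch sketch; the seed-sampling region and rescaling are simplified relative to the paper:

```python
import torch
import torch.nn.functional as F

def drop_block(x, drop_prob=0.1, block_size=5):
    """Zero contiguous block_size x block_size regions of feature maps (training only)."""
    if drop_prob == 0.0:
        return x
    _, _, h, w = x.shape
    # gamma scaled so roughly drop_prob of activations end up dropped (paper's heuristic)
    gamma = (drop_prob / block_size ** 2) * (h * w) / ((h - block_size + 1) * (w - block_size + 1))
    seeds = (torch.rand_like(x) < gamma).float()      # simplified: seeds sampled everywhere
    mask = 1.0 - F.max_pool2d(seeds, kernel_size=block_size, stride=1, padding=block_size // 2)
    return x * mask * mask.numel() / mask.sum()       # rescale to keep expected activation

x = torch.randn(2, 8, 16, 16)
print(drop_block(x).shape)                            # torch.Size([2, 8, 16, 16])
```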
An empirical evaluation of generic convolutional and recurrent networks for sequence modeling
For most deep learning practitioners, sequence modeling is synonymous with recurrent
networks. Yet recent results indicate that convolutional architectures can outperform …
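The convolutional architectures in question are temporal convolutional networks built from dilated causal convolutions; a minimal PyTorch sketch (layer sizes and the dilation schedule are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """Dilated causal convolution: the output at time t sees only inputs at times <= t."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation       # left-pad only, so no future leakage
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)

    def forward(self, x):                             # x: (batch, channels, time)
        return self.conv(F.pad(x, (self.pad, 0)))

# dilations 1, 2, 4, 8 give an exponentially growing receptive field
tcn = nn.Sequential(*[CausalConv1d(16, dilation=2 ** i) for i in range(4)])
print(tcn(torch.randn(1, 16, 100)).shape)             # torch.Size([1, 16, 100])
```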
Neural architecture optimization
Automatic neural architecture design has shown its potential in discovering powerful neural
network architectures. Existing methods, no matter based on reinforcement learning or …
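NAO's core idea is to optimize in a continuous embedding of architectures by following the gradient of a learned performance predictor, then decode back to a discrete architecture. A toy sketch of just that inner loop, with an untrained random predictor standing in for the jointly trained encoder/predictor/decoder:

```python
import torch
import torch.nn as nn

# Illustrative only: NAO trains encoder, predictor, and decoder on evaluated
# architectures; here the predictor is random and the embedding is a raw vector.
torch.manual_seed(0)
predictor = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))

z = torch.randn(8, requires_grad=True)   # continuous embedding of some architecture
for _ in range(10):
    score = predictor(z).sum()           # predicted accuracy of the embedded architecture
    grad, = torch.autograd.grad(score, z)
    z = (z + 0.1 * grad).detach().requires_grad_(True)   # gradient ascent step
print(predictor(z).item())
```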