On efficient training of large-scale deep learning models: A literature review

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - arXiv preprint arXiv …, 2023 - arxiv.org
The field of deep learning has witnessed significant progress, particularly in computer vision
(CV), natural language processing (NLP), and speech. The use of large-scale models …

Hyena hierarchy: Towards larger convolutional language models

M Poli, S Massaroli, E Nguyen, DY Fu… - International …, 2023 - proceedings.mlr.press
Recent advances in deep learning have relied heavily on the use of large Transformers due
to their ability to learn at scale. However, the core building block of Transformers, the …

Randomized numerical linear algebra: Foundations and algorithms

PG Martinsson, JA Tropp - Acta Numerica, 2020 - cambridge.org
This survey describes probabilistic algorithms for linear algebraic computations, such as
factorizing matrices and solving linear systems. It focuses on techniques that have a proven …

Parameter-efficient orthogonal finetuning via butterfly factorization

W Liu, Z Qiu, Y Feng, Y Xiu, Y Xue, L Yu… - arXiv preprint arXiv …, 2023 - arxiv.org
Large foundation models are becoming ubiquitous, but training them from scratch is
prohibitively expensive. Thus, efficiently adapting these powerful models to downstream …

On Efficient Training of Large-Scale Deep Learning Models

L Shen, Y Sun, Z Yu, L Ding, X Tian, D Tao - ACM Computing Surveys, 2024 - dl.acm.org
The field of deep learning has witnessed significant progress in recent times, particularly in
areas such as computer vision (CV), natural language processing (NLP), and speech. The …

Pixelated butterfly: Simple and efficient sparse training for neural network models

T Dao, B Chen, K Liang, J Yang, Z Song… - arXiv preprint arXiv …, 2021 - arxiv.org
Overparameterized neural networks generalize well but are expensive to train. Ideally, one
would like to reduce their computational cost while retaining their generalization benefits …

SwitchNet: a neural network model for forward and inverse scattering problems

Y Khoo, L Ying - SIAM Journal on Scientific Computing, 2019 - SIAM
We propose a novel neural network architecture, SwitchNet, for solving wave equation
based inverse scattering problems via providing maps between the scatterers and the …

Learning fast algorithms for linear transforms using butterfly factorizations

T Dao, A Gu, M Eichhorn, A Rudra… - … conference on machine …, 2019 - proceedings.mlr.press
Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier
transform, discrete cosine transform, and other structured transformations such as …

Boundary work

M Carlson, SC Lewis - The handbook of journalism studies, 2019 - taylorfrancis.com
This chapter offers a state-of-the-art analysis of boundary work and journalism. Physical
boundaries dictate how space is understood and create complex impediments. Boundary …

A butterfly-based direct integral-equation solver using hierarchical LU factorization for analyzing scattering from electrically large conducting objects

H Guo, Y Liu, J Hu, E Michielssen - IEEE Transactions on …, 2017 - ieeexplore.ieee.org
A butterfly-based direct combined-field integral-equation (CFIE) solver for analyzing
scattering from electrically large, perfect electrically conducting objects is presented. The …