Structured pruning for deep convolutional neural networks: A survey
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
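As a minimal illustration of the filter-level (structured) pruning this survey covers, the NumPy sketch below ranks the filters of a convolutional layer by L1 norm and removes the weakest ones together with the matching input channels of the next layer; the function name and the 50% pruning ratio are illustrative, not a specific method from the survey.

```python
import numpy as np

def prune_conv_filters(weight, next_weight, ratio=0.5):
    """Structured (filter-level) pruning sketch: drop the filters of a conv
    layer with the smallest L1 norms, and remove the matching input channels
    of the following layer. Shapes follow the (out, in, kH, kW) convention."""
    # Importance score of each output filter: sum of absolute weights.
    scores = np.abs(weight).reshape(weight.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(weight.shape[0] * (1.0 - ratio))))
    keep = np.sort(np.argsort(scores)[-n_keep:])      # indices of surviving filters
    pruned = weight[keep]                             # smaller conv layer
    pruned_next = next_weight[:, keep]                # next layer made consistent
    return pruned, pruned_next, keep

# Toy usage: a 16-filter layer followed by a 32-filter layer.
w1 = np.random.randn(16, 3, 3, 3)
w2 = np.random.randn(32, 16, 3, 3)
w1_p, w2_p, kept = prune_conv_filters(w1, w2, ratio=0.5)
print(w1_p.shape, w2_p.shape)   # (8, 3, 3, 3) (32, 8, 3, 3)
```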
Adaptive proximal gradient methods for structured neural networks
We consider the training of structured neural networks where the regularizer can be non-smooth and possibly non-convex. While popular machine learning libraries have resorted to …
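The basic building block behind such methods is a proximal step for a structured (here group-lasso) regularizer applied after a stochastic gradient step. The sketch below shows only this plain, non-adaptive step with illustrative names and constants; handling an adaptive, preconditioned step (the subject of the paper) requires more care, since the prox under a diagonal metric no longer has this simple closed form.

```python
import numpy as np

def prox_group_lasso(w, groups, step, lam):
    """Proximal operator of lam * sum_g ||w_g||_2 with step size `step`:
    block soft-thresholding of each parameter group."""
    out = w.copy()
    for g in groups:
        norm = np.linalg.norm(w[g])
        scale = max(0.0, 1.0 - step * lam / norm) if norm > 0 else 0.0
        out[g] = scale * w[g]
    return out

# One composite step on f(w) + lam * sum_g ||w_g||_2: a (stochastic) gradient
# step on the smooth loss f, followed by the prox of the regularizer.
w = np.random.randn(6)
grad = np.random.randn(6)            # stand-in for a minibatch gradient of f
groups = [slice(0, 3), slice(3, 6)]  # two parameter groups
step, lam = 0.1, 0.5
w = prox_group_lasso(w - step * grad, groups, step, lam)
```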
Nonlinear functional modeling using neural networks
We introduce a new class of nonlinear models for functional data based on neural networks.
Deep learning has been very successful in nonlinear modeling, but there has been little …
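The snippet does not describe the architecture, but a common way to feed functional inputs to a network in general is to sample each curve on a grid and approximate the inner products with learned weight functions by quadrature. The sketch below shows only that generic idea, with an illustrative functional_layer helper and tanh activation; it is not the model class proposed in this paper.

```python
import numpy as np

def functional_layer(curves, weight_funcs, dt):
    """Each hidden unit computes a quadrature approximation of
    integral x(t) * beta_k(t) dt between the input curve x and a learned
    weight function beta_k, both sampled on a common grid."""
    # curves: (n_samples, n_grid), weight_funcs: (n_hidden, n_grid)
    return np.tanh(curves @ weight_funcs.T * dt)

grid = np.linspace(0.0, 1.0, 101)
dt = grid[1] - grid[0]
curves = np.sin(2 * np.pi * np.outer(np.random.rand(5), grid))  # 5 functional inputs
weight_funcs = np.random.randn(4, grid.size)                    # 4 hidden units
hidden = functional_layer(curves, weight_funcs, dt)
print(hidden.shape)  # (5, 4)
```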
Compressed decentralized proximal stochastic gradient method for nonconvex composite problems with heterogeneous data
We first propose a decentralized proximal stochastic gradient tracking method (DProxSGT)
for nonconvex stochastic composite problems, with data heterogeneously distributed on …
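A rough picture of proximal gradient tracking over a network (ignoring stochastic gradients and the compression studied in the paper) is sketched below on a toy least-squares-plus-l1 problem with three nodes and a fixed doubly stochastic mixing matrix; the mixing weights, step size, and problem are illustrative, and the exact ordering of DProxSGT's updates may differ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim, eta, lam = 3, 5, 0.05, 0.01

# Heterogeneous local least-squares problems f_i(x) = 0.5 * ||A_i x - b_i||^2.
A = [rng.normal(size=(20, dim)) for _ in range(n_nodes)]
b = [rng.normal(size=20) for _ in range(n_nodes)]
grad = lambda i, x: A[i].T @ (A[i] @ x - b[i]) / len(b[i])
prox = lambda x, t: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)  # l1 prox

# Fully connected 3-node mixing matrix (doubly stochastic).
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])

x = np.zeros((n_nodes, dim))
g_old = np.stack([grad(i, x[i]) for i in range(n_nodes)])
y = g_old.copy()                      # gradient trackers

for _ in range(200):
    # Mix with neighbours, take a tracked-gradient step, then apply the prox.
    x = np.stack([prox(W[i] @ x - eta * y[i], eta * lam) for i in range(n_nodes)])
    g_new = np.stack([grad(i, x[i]) for i in range(n_nodes)])
    y = np.stack([W[i] @ y + g_new[i] - g_old[i] for i in range(n_nodes)])
    g_old = g_new
```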
A Bregman learning framework for sparse neural networks
We propose a learning framework based on stochastic Bregman iterations, also known as
mirror descent, to train sparse neural networks with an inverse scale space approach. We …
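The inverse-scale-space behaviour can be illustrated with the classical stochastic linearized Bregman iteration: the optimizer updates a subgradient variable and maps it to the weights by soft-thresholding, so all weights start at exactly zero and are activated only once enough gradient evidence has accumulated. The toy regression problem and constants below are illustrative, and the paper's updates for deep networks differ in detail.

```python
import numpy as np

def shrink(v, lam):
    """Soft-thresholding: the link between the dual variable and the weights."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

rng = np.random.default_rng(1)
A, x_true = rng.normal(size=(200, 50)), np.zeros(50)
x_true[:5] = 3.0
b = A @ x_true + 0.1 * rng.normal(size=200)

lam, eta = 1.0, 0.01
v = np.zeros(50)          # dual / subgradient variable
w = shrink(v, lam)        # primal weights, initially all zero
for _ in range(2000):
    idx = rng.choice(200, size=20, replace=False)       # minibatch
    g = A[idx].T @ (A[idx] @ w - b[idx]) / len(idx)
    v -= eta * g          # gradient step in the dual variable
    w = shrink(v, lam)    # map back to (sparse) weights
print(np.count_nonzero(w), "nonzero weights")
```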
Riemannian low-rank model compression for federated learning with over-the-air aggregation
Low-rank model compression is a widely used technique for reducing the computational
load when training machine learning models. However, existing methods often rely on …
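The basic operation in low-rank model compression is the projection of a weight matrix onto the set of rank-r matrices via truncated SVD, which is also the standard retraction used in Riemannian approaches. The sketch below shows only that compression step with an illustrative helper; the paper's Riemannian optimization and over-the-air aggregation are not represented.

```python
import numpy as np

def compress_low_rank(weight, rank):
    """Project a weight matrix onto the manifold of rank-`rank` matrices via
    truncated SVD, and return the two factors that are actually stored/sent."""
    U, s, Vt = np.linalg.svd(weight, full_matrices=False)
    left = U[:, :rank] * s[:rank]          # (m, rank)
    right = Vt[:rank]                      # (rank, n)
    return left, right

W = np.random.randn(256, 512)
L, R = compress_low_rank(W, rank=16)
W_approx = L @ R
ratio = (L.size + R.size) / W.size
print(f"compression ratio {ratio:.3f}, error {np.linalg.norm(W - W_approx):.2f}")
```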
Structured sparsity inducing adaptive optimizers for deep learning
The parameters of a neural network are naturally organized in groups, some of which might
not contribute to its overall performance. To prune out unimportant groups of parameters, we …
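Once such an optimizer has driven unimportant groups exactly to zero, the pruning itself is bookkeeping: find the groups whose norm is zero and remove the corresponding rows and downstream columns. The sketch below, with a hypothetical dead_neurons helper, shows that check for the output neurons of a dense layer.

```python
import numpy as np

def dead_neurons(weight, bias, tol=0.0):
    """After training with a structured-sparsity regularizer, whole groups
    (here: rows of a dense layer, i.e. output neurons) can be exactly zero.
    Return the indices of neurons whose entire incoming group is zero."""
    group_norms = np.linalg.norm(np.hstack([weight, bias[:, None]]), axis=1)
    return np.flatnonzero(group_norms <= tol)

W = np.random.randn(8, 4)
b = np.random.randn(8)
W[[2, 5]] = 0.0
b[[2, 5]] = 0.0
print(dead_neurons(W, b))   # [2 5]
```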
An inexact augmented lagrangian algorithm for training leaky ReLU neural network with group sparsity
The leaky ReLU network with a group sparse regularization term has been widely used in recent years. However, training such a network yields a nonsmooth nonconvex …
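A generic augmented Lagrangian (ADMM-style) splitting for such a problem introduces a copy Z of the weights, minimizes the loss-plus-penalty term in W only inexactly (a few gradient steps), applies the exact group soft-thresholding prox in Z, and updates the multipliers. The toy one-hidden-layer leaky ReLU sketch below illustrates only this generic pattern with illustrative constants; the algorithm in the paper differs in its subproblems and analysis.

```python
import numpy as np

rng = np.random.default_rng(2)
X, n_hidden = rng.normal(size=(100, 10)), 6
y = rng.normal(size=100)
leaky = lambda z: np.where(z > 0, z, 0.01 * z)
dleaky = lambda z: np.where(z > 0, 1.0, 0.01)

def loss_grad(W):
    """Least-squares loss of a one-hidden-layer leaky ReLU net with fixed
    all-ones output weights, and its gradient w.r.t. the input weights W."""
    H = leaky(X @ W)                       # (100, n_hidden)
    r = H.sum(axis=1) - y
    G = X.T @ (r[:, None] * dleaky(X @ W)) / len(y)
    return 0.5 * np.mean(r ** 2), G

lam, rho, eta = 0.1, 1.0, 0.05
W = rng.normal(size=(10, n_hidden)) * 0.1
Z, U = W.copy(), np.zeros_like(W)          # split variable and multipliers

for _ in range(100):
    # Inexact primal step: a few gradient steps on f(W) + (rho/2)||W - Z + U||^2.
    for _ in range(5):
        _, G = loss_grad(W)
        W -= eta * (G + rho * (W - Z + U))
    # Exact Z step: column-wise (group) soft-thresholding, the prox of
    # lam * sum_g ||Z_g||_2 with parameter lam / rho.
    V = W + U
    norms = np.maximum(np.linalg.norm(V, axis=0, keepdims=True), 1e-12)
    Z = np.maximum(0.0, 1.0 - (lam / rho) / norms) * V
    U += W - Z                              # multiplier (dual) update
```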
Training structured neural networks through manifold identification and variance reduction
ZS Huang, C Lee - arXiv preprint arXiv:2112.02612, 2021 - arxiv.org
This paper proposes an algorithm (RMDA) for training neural networks (NNs) with a
regularization term for promoting desired structures. RMDA does not incur computation …
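RMDA builds on regularized dual averaging; as background only, the sketch below shows the classical dual-averaging update for an l1 regularizer, where the next iterate is a soft-thresholded function of the running gradient average and is therefore exactly sparse. The constants and the toy problem are illustrative, and RMDA's own update differs in its momentum-style scaling and variance-reduction properties.

```python
import numpy as np

def rda_l1_step(grad_sum, t, gamma, lam):
    """Classical regularized dual averaging update for an l1 regularizer:
    the next iterate depends on the average of all past stochastic
    gradients, soft-thresholded at the regularization level lam."""
    g_bar = grad_sum / t
    return -(np.sqrt(t) / gamma) * np.sign(g_bar) * np.maximum(np.abs(g_bar) - lam, 0.0)

# Toy usage on a sparse least-squares problem.
rng = np.random.default_rng(3)
A, x_true = rng.normal(size=(300, 40)), np.zeros(40)
x_true[:4] = 2.0
b = A @ x_true + 0.05 * rng.normal(size=300)

w, grad_sum, gamma, lam = np.zeros(40), np.zeros(40), 5.0, 0.05
for t in range(1, 1001):
    idx = rng.choice(300, size=30, replace=False)
    grad_sum += A[idx].T @ (A[idx] @ w - b[idx]) / len(idx)
    w = rda_l1_step(grad_sum, t, gamma, lam)
print(np.count_nonzero(w), "nonzero coordinates")
```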
A general family of stochastic proximal gradient methods for deep learning
We study the training of regularized neural networks where the regularizer can be non-smooth and non-convex. We propose a unified framework for stochastic proximal gradient …
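The unified view is a template: turn stochastic gradients into an update direction by some rule (plain SGD, momentum, adaptive scaling), take the step, and apply the proximal map of the regularizer. The sketch below instantiates that template for an l1 regularizer with plain SGD and heavy-ball momentum directions; the helper names are hypothetical, and the paper's framework also covers preconditioned updates and non-convex regularizers not shown here.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_sgd(w0, stoch_grad, direction, prox, eta, steps):
    """Generic stochastic proximal gradient template: any rule for turning
    stochastic gradients into an update direction can be plugged into
    `direction`, followed by the prox of the (possibly non-smooth) regularizer."""
    w, state = w0.copy(), {}
    for _ in range(steps):
        d = direction(stoch_grad(w), state)
        w = prox(w - eta * d, eta)
    return w

# Two instances of the template: plain SGD and heavy-ball momentum.
sgd = lambda g, state: g
def momentum(g, state, beta=0.9):
    state["m"] = beta * state.get("m", np.zeros_like(g)) + g
    return state["m"]

rng = np.random.default_rng(4)
A, b = rng.normal(size=(200, 30)), rng.normal(size=200)
def stoch_grad(w):
    i = rng.choice(200, size=32, replace=False)          # fresh minibatch per call
    return A[i].T @ (A[i] @ w - b[i]) / len(i)
prox_l1 = lambda w, eta: soft_threshold(w, eta * 0.1)    # lam = 0.1

w_sgd = prox_sgd(np.zeros(30), stoch_grad, sgd, prox_l1, eta=0.05, steps=500)
w_mom = prox_sgd(np.zeros(30), stoch_grad, momentum, prox_l1, eta=0.05, steps=500)
```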