Structured pruning for deep convolutional neural networks: A survey

Y He, L Xiao - IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 - ieeexplore.ieee.org
The remarkable performance of deep convolutional neural networks (CNNs) is generally
attributed to their deeper and wider architectures, which can come with significant …
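
As a concrete illustration of the structured pruning this survey covers, here is a minimal NumPy sketch of magnitude-based filter pruning, one of the standard criteria: whole convolutional filters are ranked by L1 norm and the smallest are dropped (the shapes, keep ratio, and criterion are illustrative assumptions, not a specific method from the survey):

```python
import numpy as np

def prune_filters_l1(weights, keep_ratio=0.5):
    """Structured pruning: drop whole conv filters with the smallest L1 norms.

    weights: (out_channels, in_channels, kH, kW) convolution kernel.
    Returns the pruned kernel and the indices of the surviving filters.
    """
    norms = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(keep_ratio * weights.shape[0]))
    keep = np.sort(np.argsort(norms)[-n_keep:])   # largest-norm filters
    return weights[keep], keep

# Example: halve the 16 filters of a 3x3 convolution layer.
rng = np.random.default_rng(0)
w = rng.normal(size=(16, 8, 3, 3))
w_pruned, kept = prune_filters_l1(w)
print(w_pruned.shape)   # (8, 8, 3, 3): the layer itself shrinks
```

Unlike unstructured weight pruning, removing whole filters shrinks the actual tensor shapes, so the saving needs no sparse-matrix support.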

Adaptive proximal gradient methods for structured neural networks

J Yun, AC Lozano, E Yang - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
We consider the training of structured neural networks where the regularizer can be non-
smooth and possibly non-convex. While popular machine learning libraries have resorted to …
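
For context, a single adaptive proximal step of the kind this line of work studies can be sketched as follows, assuming an l1 regularizer and an Adam-style diagonal preconditioner (the per-coordinate threshold is the closed-form prox of l1 under a diagonal metric; the paper's framework covers broader, possibly non-convex regularizers):

```python
import numpy as np

def adaptive_prox_step(w, grad, v, lr=1e-3, lam=1e-4, eps=1e-8):
    """One proximal step on loss(w) + lam*||w||_1 under an Adam-like
    diagonal preconditioner v (a running average of squared gradients).
    The preconditioner rescales the gradient step AND the per-coordinate
    soft-threshold, which is the exact prox of l1 in that metric."""
    d = np.sqrt(v) + eps            # diagonal metric
    z = w - lr * grad / d           # preconditioned gradient step
    tau = lr * lam / d              # coordinate-wise threshold
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)
```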

Nonlinear functional modeling using neural networks

AR Rao, M Reimherr - Journal of Computational and Graphical Statistics, 2023 - Taylor & Francis
We introduce a new class of nonlinear models for functional data based on neural networks.
Deep learning has been very successful in nonlinear modeling, but there has been little …
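
One common baseline for feeding functional data to a network, shown here only as an assumed illustration and not necessarily the paper's architecture, is to expand each observed curve on a fixed basis and let an ordinary network act on the coefficients:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50)                                   # observation grid
basis = np.stack([np.sin(np.pi * k * t) for k in range(1, 8)])  # (7, 50) sine basis

def curve_to_features(curve):
    """Least-squares basis coefficients of one observed curve; the
    resulting fixed-length vector can be fed to any standard network."""
    coef, *_ = np.linalg.lstsq(basis.T, curve, rcond=None)
    return coef

rng = np.random.default_rng(1)
curve = np.sin(2 * np.pi * t) + 0.05 * rng.normal(size=t.size)
print(curve_to_features(curve).round(2))   # energy concentrates on k = 2
```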

Compressed decentralized proximal stochastic gradient method for nonconvex composite problems with heterogeneous data

Y Yan, J Chen, PY Chen, X Cui… - International Conference on Machine Learning, 2023 - proceedings.mlr.press
We first propose a decentralized proximal stochastic gradient tracking method (DProxSGT)
for nonconvex stochastic composite problems, with data heterogeneously distributed on …
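
A minimal NumPy sketch of decentralized proximal gradient tracking on a toy quadratic with heterogeneous local data (the communication compression that gives the paper its title is omitted; the ring topology, mixing matrix, step size, and l1 regularizer are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 4, 5                              # nodes, dimension
targets = rng.normal(size=(n, d))        # heterogeneous local data

def local_grad(i, x):                    # grad of 0.5 * ||x - targets[i]||^2
    return x - targets[i]

def prox_l1(x, tau):                     # prox of tau * ||x||_1
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

# Symmetric doubly stochastic mixing matrix for a ring of four nodes.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

lr, lam = 0.2, 0.01
x = np.zeros((n, d))
y = np.array([local_grad(i, x[i]) for i in range(n)])   # gradient trackers

for _ in range(500):
    g_old = np.array([local_grad(i, x[i]) for i in range(n)])
    x = prox_l1(W @ x - lr * y, lr * lam)               # mix, step, prox
    g_new = np.array([local_grad(i, x[i]) for i in range(n)])
    y = W @ y + g_new - g_old                           # track avg gradient

print(np.abs(x - x.mean(axis=0)).max())   # consensus gap shrinks toward 0
```

The tracker y lets each node follow the network-wide average gradient despite only talking to its neighbors, which is what copes with the data heterogeneity.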

A Bregman learning framework for sparse neural networks

L Bungert, T Roith, D Tenbrinck, M Burger - Journal of Machine Learning Research, 2022 - jmlr.org
We propose a learning framework based on stochastic Bregman iterations, also known as
mirror descent, to train sparse neural networks with an inverse scale space approach. We …
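
The inverse scale space behavior can be seen in the classic linearized Bregman iteration, a sparse form of mirror descent, sketched here on a toy sparse-recovery problem (an assumed illustration, not the paper's stochastic training algorithm): the iterate starts empty and parameters enter the support one by one as the dual variable grows.

```python
import numpy as np

def shrink(x, lam):                       # soft threshold = prox of lam*||.||_1
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# Toy sparse recovery: loss(w) = 0.5 * ||A w - b||^2 with a 5-sparse truth.
rng = np.random.default_rng(3)
A = rng.normal(size=(30, 100)) / np.sqrt(30)
w_true = np.zeros(100)
w_true[:5] = 1.0
b = A @ w_true

lam, lr = 1.0, 0.1
v = np.zeros(100)                         # dual (subgradient) variable
for _ in range(3000):
    w = shrink(v, lam)                    # primal iterate stays sparse
    v -= lr * A.T @ (A @ w - b)           # plain gradient step in the dual
print(np.count_nonzero(shrink(v, lam)))   # support grows slowly from zero
```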

Riemannian low-rank model compression for federated learning with over-the-air aggregation

Y Xue, V Lau - IEEE Transactions on Signal Processing, 2023 - ieeexplore.ieee.org
Low-rank model compression is a widely used technique for reducing the computational
load when training machine learning models. However, existing methods often rely on …
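
As background, the plain (non-Riemannian) baseline for low-rank model compression is a truncated SVD of each weight matrix; the sketch below shows the factorization and the parameter saving (the paper's contribution, optimization on the fixed-rank manifold with over-the-air aggregation, is beyond this snippet):

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Compress a dense weight matrix as W ~= U @ V via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]   # singular values folded into U

rng = np.random.default_rng(4)
W = rng.normal(size=(256, 512))
U, V = low_rank_factorize(W, rank=32)
# Storage drops from 256*512 to 32*(256+512); the layer becomes two matmuls.
print(U.shape, V.shape, np.linalg.norm(W - U @ V) / np.linalg.norm(W))
```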

Structured sparsity inducing adaptive optimizers for deep learning

T Deleu, Y Bengio - arXiv preprint arXiv:2102.03869, 2021 - arxiv.org
The parameters of a neural network are naturally organized in groups, some of which might
not contribute to its overall performance. To prune out unimportant groups of parameters, we …
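
The group-pruning mechanism such optimizers rely on is the proximal operator of the group lasso penalty, which zeroes whole parameter groups at once; a minimal sketch with one group per row of a weight matrix (e.g., per output neuron), given as an assumed illustration:

```python
import numpy as np

def group_soft_threshold(W, tau):
    """Prox of tau * sum_g ||W[g]||_2 with one group per row: rows whose
    norm is below tau are zeroed out entirely, removing the whole group."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W * np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)

rng = np.random.default_rng(5)
W = rng.normal(size=(64, 32)) * rng.uniform(0.0, 1.0, size=(64, 1))
W_sparse = group_soft_threshold(W, tau=2.0)
print((np.linalg.norm(W_sparse, axis=1) == 0).sum(), "of 64 rows pruned")
```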

An inexact augmented Lagrangian algorithm for training leaky ReLU neural network with group sparsity

W Liu, X Liu, X Chen - Journal of Machine Learning Research, 2023 - jmlr.org
The leaky ReLU network with a group sparse regularization term has been widely used in
recent years. However, training such a network yields a nonsmooth nonconvex …
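
To make the nonsmooth nonconvex training problem concrete, here is the kind of objective involved for a one-hidden-layer leaky ReLU network with a group sparse term coupling each hidden neuron's incoming weights (an assumed toy instance; the paper's inexact augmented Lagrangian scheme for solving it is not shown):

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def objective(W1, b1, W2, b2, X, y, lam):
    """Squared loss of a one-hidden-layer leaky ReLU network plus a group
    sparse penalty: each column of W1 (one hidden neuron's incoming
    weights) forms a group, so the penalty can delete neurons outright."""
    pred = leaky_relu(X @ W1 + b1) @ W2 + b2
    loss = 0.5 * np.mean((pred.ravel() - y) ** 2)
    return loss + lam * np.linalg.norm(W1, axis=0).sum()
```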

Training structured neural networks through manifold identification and variance reduction

ZS Huang, C Lee - arXiv preprint arXiv:2112.02612, 2021 - arxiv.org
This paper proposes an algorithm (RMDA) for training neural networks (NNs) with a
regularization term for promoting desired structures. RMDA does not incur computation …
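
RMDA builds on regularized dual averaging; the classic l1-regularized dual averaging update (Xiao, 2010) below conveys the core idea, where the next iterate is a closed form of the averaged past gradients (the paper's momentum and manifold identification guarantees are not reproduced here):

```python
import numpy as np

def shrink(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def rda_l1(grad_fn, d, lam=0.1, gamma=10.0, steps=1000):
    """l1-regularized dual averaging: the next iterate is a closed-form
    function of the running average of all past (stochastic) gradients."""
    g_sum, w = np.zeros(d), np.zeros(d)
    for t in range(1, steps + 1):
        g_sum += grad_fn(w)
        w = -(np.sqrt(t) / gamma) * shrink(g_sum / t, lam)
    return w

rng = np.random.default_rng(6)
A, b = rng.normal(size=(20, 10)), rng.normal(size=20)
w = rda_l1(lambda w: A.T @ (A @ w - b) / 20, d=10)
print(np.count_nonzero(w), "nonzero coordinates")
```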

A general family of stochastic proximal gradient methods for deep learning

J Yun, AC Lozano, E Yang - arXiv preprint arXiv:2007.07484, 2020 - arxiv.org
We study the training of regularized neural networks where the regularizer can be non-
smooth and non-convex. We propose a unified framework for stochastic proximal gradient …
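
A representative member of this family is stochastic proximal gradient with momentum, sketched below for an l1 regularizer and a toy least-squares loss (an assumed simple instance; the unified framework in the paper also covers preconditioned updates and non-convex regularizers):

```python
import numpy as np

def prox_l1(x, tau):
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def prox_sgd_momentum(w, grad_fn, lr=0.1, lam=0.01, beta=0.9, steps=500):
    """Momentum step on the smooth loss, then the prox of the nonsmooth
    regularizer; with lam = 0 this reduces to plain momentum SGD."""
    m = np.zeros_like(w)
    for _ in range(steps):
        m = beta * m + (1 - beta) * grad_fn(w)
        w = prox_l1(w - lr * m, lr * lam)
    return w

rng = np.random.default_rng(7)
A, b = rng.normal(size=(50, 20)), rng.normal(size=50)
w = prox_sgd_momentum(np.zeros(20), lambda w: A.T @ (A @ w - b) / 50)
print(np.count_nonzero(w), "of 20 weights survive")
```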