AC/DC: Alternating compressed/decompressed training of deep neural networks

A Peste, E Iofinova, A Vladu… - Advances in Neural …, 2021 - proceedings.neurips.cc
The increasing computational requirements of deep neural networks (DNNs) have led to
significant interest in obtaining DNN models that are sparse, yet accurate. Recent work has …
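
As a rough illustration of the alternating idea named in the title (not the authors' exact algorithm; the magnitude-pruning criterion, sparsity level, and phase schedule below are all assumptions), a "compressed" phase can be approximated by projecting the weights back onto a sparse support after every optimizer step:

    import torch

    def magnitude_mask(w, sparsity=0.9):
        # Binary mask keeping the largest-magnitude (1 - sparsity) fraction of entries.
        k = max(1, int(w.numel() * (1 - sparsity)))
        thresh = w.abs().flatten().topk(k).values.min()
        return (w.abs() >= thresh).float()

    def train_alternating(model, optimizer, loss_fn, steps, phase_len=1000):
        # Even phases train dense; odd phases train "compressed" by zeroing
        # small weights after each step. loss_fn is assumed to be a closure
        # computing the training loss for the model on some batch.
        for step in range(steps):
            optimizer.zero_grad()
            loss_fn(model).backward()
            optimizer.step()
            if (step // phase_len) % 2 == 1:
                with torch.no_grad():
                    for p in model.parameters():
                        if p.dim() > 1:  # prune weight matrices, leave biases dense
                            p.mul_(magnitude_mask(p))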

Proving linear mode connectivity of neural networks via optimal transport

D Ferbach, B Goujaud, G Gidel… - International …, 2024 - proceedings.mlr.press
The energy landscape of high-dimensional non-convex optimization problems is crucial to
understanding the effectiveness of modern deep neural network architectures. Recent works …
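
For reference, a standard formalization of linear mode connectivity (background notation, not necessarily this paper's): two solutions \theta_1, \theta_2 with loss L are linearly mode connected when the barrier along the segment between them is (near) zero,

    B(\theta_1, \theta_2) = \sup_{\alpha \in [0,1]} \Big[ L\big((1-\alpha)\theta_1 + \alpha\theta_2\big) - (1-\alpha)L(\theta_1) - \alpha L(\theta_2) \Big] \approx 0.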

Deep model fusion: A survey

W Li, Y Peng, M Zhang, L Ding, H Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep model fusion/merging is an emerging technique that merges the parameters or
predictions of multiple deep learning models into a single one. It combines the abilities of …
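
The simplest parameter-level instance of such fusion is uniform weight averaging across models sharing one architecture; a minimal sketch (the helper name is illustrative, and this is one baseline among the many methods the survey covers):

    import torch

    def average_state_dicts(state_dicts):
        # Element-wise mean of compatible state dicts; assumes all models
        # share the same architecture and parameter names.
        return {
            key: torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
            for key in state_dicts[0]
        }

    # merged_model.load_state_dict(average_state_dicts([m.state_dict() for m in models]))

Naive averaging of this kind generally only helps when the models sit in a common loss basin, which is where the mode-connectivity results elsewhere in this list become relevant.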

A rigorous framework for the mean field limit of multilayer neural networks

PM Nguyen, HT Pham - Mathematical Statistics and Learning, 2023 - ems.press
We develop a mathematically rigorous framework for multilayer neural networks in the mean
field regime. As the network's widths increase, the network's learning trajectory is shown to …
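
As background for the regime in question (standard single-hidden-layer notation, not the paper's multilayer formalism): under mean-field scaling, a width-N network is an average over neurons,

    f(x) = \frac{1}{N} \sum_{i=1}^{N} a_i \, \sigma(w_i^\top x) \;\longrightarrow\; \int a \, \sigma(w^\top x) \, \mathrm{d}\rho(a, w) \quad (N \to \infty),

so that training becomes an evolution of the parameter distribution \rho; the paper develops a rigorous version of this picture for networks with multiple layers.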

Progress toward favorable landscapes in quantum combinatorial optimization

J Lee, AB Magann, HA Rabitz, C Arenz - Physical Review A, 2021 - APS
The performance of variational quantum algorithms relies on the success of using quantum
and classical computing resources in tandem. Here, we study how these quantum and …

Deep networks on toroids: removing symmetries reveals the structure of flat regions in the landscape geometry

F Pittorino, A Ferraro, G Perugini… - International …, 2022 - proceedings.mlr.press
We systematize the approach to the investigation of deep neural network landscapes by
basing it on the geometry of the space of implemented functions rather than the space of …
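
A small sketch of the parameter symmetries being quotiented out here (the concrete network and permutation are illustrative, not from the paper): permuting an MLP's hidden units, together with the matching columns of the next layer, changes the parameters but not the implemented function.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3))
    x = torch.randn(5, 4)
    y_before = net(x)

    perm = torch.randperm(8)
    with torch.no_grad():
        net[0].weight.copy_(net[0].weight[perm])      # permute hidden units...
        net[0].bias.copy_(net[0].bias[perm])
        net[2].weight.copy_(net[2].weight[:, perm])   # ...and undo it downstream

    assert torch.allclose(y_before, net(x), atol=1e-6)  # same function, new parameters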

Taxonomizing local versus global structure in neural network loss landscapes

Y Yang, L Hodgkinson, R Theisen… - Advances in …, 2021 - proceedings.neurips.cc
Viewing neural network models in terms of their loss landscapes has a long history in the
statistical mechanics approach to learning, and in recent years it has received attention …

On quantum speedups for nonconvex optimization via quantum tunneling walks

Y Liu, WJ Su, T Li - Quantum, 2023 - quantum-journal.org
Classical algorithms are often not effective for solving nonconvex optimization problems
where local minima are separated by high barriers. In this paper, we explore possible …

Redundant representations help generalization in wide neural networks

D Doimo, A Glielmo, S Goldt… - Advances in Neural …, 2022 - proceedings.neurips.cc
Deep neural networks (DNNs) defy the classical bias-variance trade-off: adding parameters
to a DNN that interpolates its training data will typically improve its generalization …

Analyzing monotonic linear interpolation in neural network loss landscapes

J Lucas, J Bae, MR Zhang, S Fort, R Zemel… - arXiv preprint arXiv …, 2021 - arxiv.org
Linear interpolation between initial neural network parameters and converged parameters
after training with stochastic gradient descent (SGD) typically leads to a monotonic decrease …
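
The interpolation property is easy to probe directly; a minimal sketch, assuming state dicts theta0 (initialization) and theta1 (trained weights) for one architecture, plus a loss_fn closure over a fixed batch (all names illustrative):

    import torch

    def loss_along_interpolation(model, theta0, theta1, loss_fn, steps=11):
        # Evaluate the loss at evenly spaced points on the straight line
        # between the two parameter vectors.
        losses = []
        for alpha in torch.linspace(0.0, 1.0, steps):
            theta = {k: (1 - alpha) * theta0[k] + alpha * theta1[k] for k in theta0}
            model.load_state_dict(theta)
            with torch.no_grad():
                losses.append(loss_fn(model).item())
        return losses  # MLI holds on this path iff the list is non-increasing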