Model complexity of deep learning: A survey

X Hu, L Chu, J Pei, W Liu, J Bian - Knowledge and Information Systems, 2021 - Springer
Model complexity is a fundamental problem in deep learning. In this paper, we
conduct a systematic overview of the latest studies on model complexity in deep learning …

Optimization problems for machine learning: A survey

C Gambella, B Ghaddar, J Naoum-Sawaya - European Journal of …, 2021 - Elsevier
This paper surveys the machine learning literature and presents in an optimization
framework several commonly used machine learning approaches. Particularly …

Neural architecture search on ImageNet in four GPU hours: A theoretically inspired perspective

W Chen, X Gong, Z Wang - arXiv preprint arXiv:2102.11535, 2021 - arxiv.org
Neural Architecture Search (NAS) has been explosively studied to automate the discovery of
top-performer neural networks. Current works require heavy training of supernet or intensive …

Liquid time-constant networks

R Hasani, M Lechner, A Amini, D Rus… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
We introduce a new class of time-continuous recurrent neural network models. Instead of
declaring a learning system's dynamics by implicit nonlinearities, we construct networks of …
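The snippet above describes state dynamics whose effective time constant depends on the input. A minimal single-unit sketch of that idea, under our own assumptions (a sigmoid gate and illustrative parameters, not the paper's trained model), can be integrated with explicit Euler steps:

```python
import math

def ltc_step(x, I, dt, tau=1.0, A=1.0, w=1.0, b=0.0):
    """One explicit-Euler step of a single liquid time-constant (LTC) unit.

    Dynamics of the form described by Hasani et al.:
        dx/dt = -(1/tau + f(x, I)) * x + f(x, I) * A
    so the effective time constant 1/(1/tau + f) varies with the input.
    The gate f and all parameter values here are illustrative assumptions.
    """
    f = 1.0 / (1.0 + math.exp(-(w * I + b)))  # sigmoid gate (an assumption)
    dx = -(1.0 / tau + f) * x + f * A
    return x + dt * dx

# Drive the unit with a constant input; the state relaxes toward the
# input-dependent fixed point f*A / (1/tau + f).
x = 0.0
for _ in range(1000):
    x = ltc_step(x, I=2.0, dt=0.01)
```

The design point is that the decay rate itself is gated by the input, rather than being a fixed hyperparameter.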

Understanding deep neural networks with rectified linear units

R Arora, A Basu, P Mianjy, A Mukherjee - arXiv preprint arXiv:1611.01491, 2016 - arxiv.org
In this paper we investigate the family of functions representable by deep neural networks
(DNN) with rectified linear units (ReLU). We give an algorithm to train a ReLU DNN with one …

Strong mixed-integer programming formulations for trained neural networks

R Anderson, J Huchette, W Ma… - Mathematical …, 2020 - Springer
We present strong mixed-integer programming (MIP) formulations for high-dimensional
piecewise linear functions that correspond to trained neural networks. These formulations …
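The standard starting point for such formulations is the big-M encoding of a single ReLU, y = max(0, a), given known pre-activation bounds L ≤ a ≤ U. A small feasibility checker sketching that textbook encoding (variable names are ours, not the paper's strengthened formulation):

```python
def relu_bigm_feasible(a, y, z, L, U, eps=1e-9):
    """Check (y, z) against the textbook big-M MIP encoding of
    y = max(0, a), for a pre-activation with known bounds L <= a <= U
    (L < 0 < U assumed):

        y >= a            y >= 0
        y <= U * z        y <= a - L * (1 - z)
        z in {0, 1}

    With z = 1 the constraints force y = a; with z = 0 they force y = 0.
    """
    return (z in (0, 1)
            and y >= a - eps
            and y >= -eps
            and y <= U * z + eps
            and y <= a - L * (1 - z) + eps)

# For any a in [L, U], y = max(0, a) with z = (a > 0) satisfies the system.
L_, U_ = -3.0, 3.0
for a in (-2.5, -0.1, 0.0, 0.4, 2.9):
    assert relu_bigm_feasible(a, max(0.0, a), 1 if a > 0 else 0, L_, U_)
```

Tighter bounds L and U give a tighter relaxation, which is one reason formulation strength matters for these models.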

Deep ReLU networks have surprisingly few activation patterns

B Hanin, D Rolnick - Advances in neural information …, 2019 - proceedings.neurips.cc
The success of deep networks has been attributed in part to their expressivity: per
parameter, deep networks can approximate a richer class of functions than shallow …
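The quantity at stake in this entry can be made concrete: each input to a ReLU network switches every neuron on or off, and the number of distinct on/off patterns realized in practice is typically far below the 2^(#neurons) a priori bound. A toy sketch with a small random network of our own choosing (not the paper's experimental setup):

```python
import random

random.seed(0)

def activation_pattern(x, layers):
    """Return the ReLU on/off pattern (tuple of 0/1 per neuron) that a
    small fully connected network produces on input x."""
    pattern = []
    h = list(x)
    for W, b in layers:
        pre = [sum(wij * hj for wij, hj in zip(row, h)) + bi
               for row, bi in zip(W, b)]
        pattern.extend(1 if p > 0 else 0 for p in pre)
        h = [max(0.0, p) for p in pre]
    return tuple(pattern)

def rand_layer(n_in, n_out):
    return ([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [random.uniform(-1, 1) for _ in range(n_out)])

# Two hidden layers of 4 neurons on 2-D inputs: 8 neurons total, so at
# most 2**8 = 256 patterns are possible a priori.
layers = [rand_layer(2, 4), rand_layer(4, 4)]
grid = [(-2 + 0.1 * i, -2 + 0.1 * j) for i in range(41) for j in range(41)]
patterns = {activation_pattern(x, layers) for x in grid}
```

Counting `patterns` over a dense input grid gives a hands-on lower bound on the number of linear regions the network realizes.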

Deep neural networks and mixed integer linear optimization

M Fischetti, J Jo - Constraints, 2018 - Springer
Deep Neural Networks (DNNs) are very popular these days, and are the subject of
a very intense investigation. A DNN is made up of layers of internal units (or neurons), each …

Which neural net architectures give rise to exploding and vanishing gradients?

B Hanin - Advances in neural information processing …, 2018 - proceedings.neurips.cc
We give a rigorous analysis of the statistical behavior of gradients in a randomly initialized
fully connected network N with ReLU activations. Our results show that the empirical …

Zen-NAS: A zero-shot NAS for high-performance image recognition

M Lin, P Wang, Z Sun, H Chen, X Sun… - Proceedings of the …, 2021 - openaccess.thecvf.com
Accuracy predictor is a key component in Neural Architecture Search (NAS) for ranking
architectures. Building a high-quality accuracy predictor usually costs enormous …