Learning single-index models with shallow neural networks
Single-index models are a class of functions given by an unknown univariate "link" function
applied to an unknown one-dimensional projection of the input. These models are …
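A minimal sketch of the model class, in notation assumed here rather than taken from the paper:
$$f(x) = g(\langle w, x \rangle), \qquad x \in \mathbb{R}^d,$$
where $g : \mathbb{R} \to \mathbb{R}$ is the unknown univariate link function and $w \in \mathbb{R}^d$ is the unknown projection direction.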
Statistically meaningful approximation: a case study on approximating Turing machines with transformers
A common lens to theoretically study neural net architectures is to analyze the functions they
can approximate. However, the constructions from approximation theory often have …
Provable guarantees for nonlinear feature learning in three-layer neural networks
One of the central questions in the theory of deep learning is to understand how neural
networks learn hierarchical features. The ability of deep networks to extract salient features …
Optimizing solution-samplers for combinatorial problems: The landscape of policy-gradient methods
Deep Neural Networks and Reinforcement Learning methods have empirically
shown great promise in tackling challenging combinatorial problems. In those methods a …
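A minimal sketch of the policy-gradient identity such methods rely on, in notation assumed here rather than taken from the paper: for a solution-sampler $p_\theta$ over candidate solutions $x$ and an objective $f$,
$$\nabla_\theta \, \mathbb{E}_{x \sim p_\theta}[f(x)] = \mathbb{E}_{x \sim p_\theta}\!\left[f(x)\, \nabla_\theta \log p_\theta(x)\right],$$
so the expected objective can be optimized from samples and log-likelihood gradients alone.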
Designing Universally-Approximating Deep Neural Networks: A First-Order Optimization Approach
Universal approximation capability, also referred to as universality, is an important property
of deep neural networks, endowing them with the potency to accurately represent the …
Optimization-based separations for neural networks
Depth separation results propose a possible theoretical explanation for the benefits of deep
neural networks over shallower architectures, establishing that the former possess superior …
Width is less important than depth in ReLU neural networks
We solve an open question from Lu et al. (2017), by showing that any target network with
inputs in $\mathbb{R}^d$ can be approximated by a width $O(d)$ network (independent …
Size and depth separation in approximating benign functions with neural networks
When studying the expressive power of neural networks, a main challenge is to understand
how the size and depth of the network affect its ability to approximate real functions …
Depth separation beyond radial functions
High-dimensional depth separation results for neural networks show that certain functions
can be efficiently approximated by two-hidden-layer networks but not by one-hidden-layer …
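For context (a standard example, not a claim about this paper's constructions): earlier depth separation results, such as Eldan and Shamir (2016), were built around radial functions of the form $f(x) = \psi(\lVert x \rVert_2)$, which depend on the input only through its norm; the title refers to separations for functions outside this class.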
Regret guarantees for online deep control
Despite the immense success of deep learning in reinforcement learning and control, few
theoretical guarantees for neural networks exist for these problems. Deriving performance …