Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review

T Poggio, H Mhaskar, L Rosasco, B Miranda… - International Journal of …, 2017 - Springer
The paper reviews and extends an emerging body of theoretical results on deep learning
including the conditions under which it can be exponentially better than shallow learning. A …
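
The exponential separation the review refers to can be made concrete with the rate comparison this line of work derives. A sketch, assuming target accuracy ε, input dimension d, smoothness m of the constituent functions, and a target with a binary-tree compositional structure whose constituents each take 2 arguments:

    N_{\text{shallow}}(\varepsilon) = O\!\left(\varepsilon^{-d/m}\right),
    \qquad
    N_{\text{deep}}(\varepsilon) = O\!\left((d-1)\,\varepsilon^{-2/m}\right)

The number of units needed by a generic shallow network carries d in the exponent, while the count for a deep network matching the compositional structure depends on d only linearly; this is the sense in which deep can be exponentially better than shallow.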

Theoretical issues in deep networks

T Poggio, A Banburski, Q Liao - Proceedings of the …, 2020 - National Acad Sciences
While deep learning is successful in a number of applications, it is not yet well understood
theoretically. A theoretical characterization of deep learning should answer questions about …

Deep vs. shallow networks: An approximation theory perspective

HN Mhaskar, T Poggio - Analysis and Applications, 2016 - World Scientific
The paper briefly reviews several recent results on hierarchical architectures for learning
from examples that may formally explain the conditions under which Deep Convolutional …

When and why are deep networks better than shallow ones?

H Mhaskar, Q Liao, T Poggio - Proceedings of the AAAI conference on …, 2017 - ojs.aaai.org
While the universal approximation property holds both for hierarchical and shallow
networks, deep networks can approximate the class of compositional functions as well as …

Classification with deep neural networks and logistic loss

Z Zhang, L Shi, DX Zhou - Journal of Machine Learning Research, 2024 - jmlr.org
Deep neural networks (DNNs) trained with the logistic loss (also known as the cross entropy
loss) have made impressive advancements in various binary classification tasks. Despite the …
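
For reference, the loss being analyzed is ℓ(y, f(x)) = log(1 + exp(-y f(x))) in the ±1 label convention. A minimal numpy sketch (the variable names are illustrative, not the paper's):

    import numpy as np

    def logistic_loss(y, f):
        # logistic (cross-entropy) loss for labels y in {+1, -1} and
        # real-valued network outputs f; logaddexp(0, t) = log(1 + exp(t))
        # keeps the computation numerically stable
        return np.logaddexp(0.0, -y * f)

    y = np.array([1.0, -1.0, 1.0])     # binary labels
    f = np.array([2.3, 0.4, -1.1])     # raw network outputs (logits)
    print(logistic_loss(y, f).mean())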

Convolutional rectifier networks as generalized tensor decompositions

N Cohen, A Shashua - International conference on machine …, 2016 - proceedings.mlr.press
Convolutional rectifier networks, i.e., convolutional neural networks with rectified linear
activation and max or average pooling, are the cornerstone of modern deep learning …
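
As an operational point of reference only (not the paper's tensor-decomposition formulation), one stage of the architecture family in question, convolution followed by ReLU and max pooling, can be sketched in a few lines of numpy:

    import numpy as np

    def relu(x):
        return np.maximum(x, 0.0)

    def conv1d_valid(x, w):
        # valid cross-correlation of a 1-D signal x with kernel w
        k = len(w)
        return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

    def max_pool1d(x, size):
        # non-overlapping max pooling; any trailing remainder is dropped
        n = (len(x) // size) * size
        return x[:n].reshape(-1, size).max(axis=1)

    x = np.random.randn(16)    # toy input signal
    w = np.random.randn(3)     # toy convolution kernel
    y = max_pool1d(relu(conv1d_valid(x, w)), size=2)

Swapping max for a mean in the pooling step gives the average-pooling variant the paper also covers.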

Learning functions: when is deep better than shallow

H Mhaskar, Q Liao, T Poggio - arXiv preprint arXiv:1603.00988, 2016 - arxiv.org
While the universal approximation property holds both for hierarchical and shallow
networks, we prove that deep (hierarchical) networks can approximate the class of …
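
The compositional class in question is typified by functions with a binary-tree structure in which every constituent function takes only two arguments; for example, with d = 8 inputs,

    f(x_1,\dots,x_8) = h_3\big(h_{21}(h_{11}(x_1,x_2),\, h_{12}(x_3,x_4)),\; h_{22}(h_{13}(x_5,x_6),\, h_{14}(x_7,x_8))\big)

A deep network whose connectivity mirrors this tree only ever has to approximate bivariate constituents, which is what removes the input dimension from the exponent of the approximation rate.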

Inductive bias of deep convolutional networks through pooling geometry

N Cohen, A Shashua - arXiv preprint arXiv:1605.06743, 2016 - arxiv.org
Our formal understanding of the inductive bias that drives the success of convolutional
networks on computer vision tasks is limited. In particular, it is unclear what makes …

A hierarchical predictive coding model of object recognition in natural images

MW Spratling - Cognitive Computation, 2017 - Springer
Predictive coding has been proposed as a model of the hierarchical perceptual inference
process performed in the cortex. However, results demonstrating that predictive coding is …

A deep network construction that adapts to intrinsic dimensionality beyond the domain

A Cloninger, T Klock - Neural Networks, 2021 - Elsevier
We study the approximation of two-layer compositions f(x) = g(ϕ(x)) via deep networks with
ReLU activation, where ϕ is a geometrically intuitive, dimensionality reducing feature map …
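
To make the function class concrete, here is a toy instance of such a composition f(x) = g(ϕ(x)), with a linear ϕ used purely for illustration (the paper's feature maps are more general geometric, dimension-reducing maps):

    import numpy as np

    D, d = 50, 2                               # ambient vs. intrinsic dimension
    A = np.random.randn(d, D) / np.sqrt(D)     # a simple linear, dimension-reducing map

    def phi(x):
        return A @ x                           # stand-in for the paper's feature map

    def g(z):
        return np.sin(z[0]) + z[1] ** 2        # smooth function on the d-dimensional range

    def f(x):
        return g(phi(x))                       # the two-layer composition f = g ∘ phi

    x = np.random.randn(D)
    print(f(x))

The point of such constructions is that the size of the approximating ReLU network should be governed by the intrinsic dimension d rather than the ambient dimension D.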