Representational strengths and limitations of transformers
Attention layers, as commonly used in transformers, form the backbone of modern deep
learning, yet there is no mathematical description of their benefits and deficiencies as …
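For reference, a minimal sketch of the scaled dot-product attention layer this paper studies (a single head in NumPy; the shapes and the self-attention toy input are illustrative, not the paper's notation):

```python
import numpy as np

def attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, n) pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # each output is a convex combination of the values

# Toy self-attention: 4 tokens with 8-dimensional embeddings (Q = K = V = X)
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))
print(attention(X, X, X).shape)                     # (4, 8)
```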
Hardness of noise-free learning for two-hidden-layer neural networks
We give superpolynomial statistical query (SQ) lower bounds for learning two-hidden-layer
ReLU networks with respect to Gaussian inputs in the standard (noise-free) model. No …
Improved bounds on neural complexity for representing piecewise linear functions
A deep neural network using rectified linear units represents a continuous piecewise linear
(CPWL) function and vice versa. Recent results in the literature estimated that the number of …
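To illustrate the forward direction of this equivalence, here is a small sketch (hypothetical random weights, NumPy) that evaluates a one-hidden-layer ReLU network on a 1D grid and checks that its second differences vanish away from finitely many breakpoints, i.e., that the network computes a continuous piecewise linear function:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical one-hidden-layer ReLU network f(x) = w2 . relu(W1 x + b1)
W1, b1 = rng.standard_normal((5, 1)), rng.standard_normal(5)
w2 = rng.standard_normal(5)

def f(x):
    return w2 @ np.maximum(W1 @ np.atleast_1d(x) + b1, 0.0)

xs = np.linspace(-3.0, 3.0, 2001)
ys = np.array([f(x) for x in xs])
second_diff = np.abs(np.diff(ys, 2))
# Second differences are zero except at grid intervals containing a kink,
# so f is piecewise linear with at most 5 breakpoints (one per hidden unit).
print((second_diff > 1e-8).sum())  # small: ~2 entries per detected kink
```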
Towards lower bounds on the depth of ReLU neural networks
We contribute to a better understanding of the class of functions that is represented by a
neural network with ReLU activations and a given architecture. Using techniques from mixed …
Optimization-based separations for neural networks
Depth separation results propose a possible theoretical explanation for the benefits of deep
neural networks over shallower architectures, establishing that the former possess superior …
Width is less important than depth in ReLU neural networks
We solve an open question from Lu et al. (2017) by showing that any target network with
inputs in $\mathbb{R}^d$ can be approximated by a width $O(d)$ network (independent …
On the optimal memorization power of ReLU neural networks
We study the memorization power of feedforward ReLU neural networks. We show that such
networks can memorize any $N$ points that satisfy a mild separability assumption using …
The connection between approximation, depth separation and learnability in neural networks
Several recent works have shown separation results between deep neural networks, and
hypothesis classes with inferior approximation capacity such as shallow networks or kernel …
Exponential separations in symmetric neural networks
In this work we demonstrate a novel separation between symmetric neural network
architectures. Specifically, we consider the Relational Network …
Size and depth of monotone neural networks: interpolation and approximation
Monotone functions and data sets arise in a variety of applications. We study the
interpolation problem for monotone data sets: The input is a monotone data set with $n$ …