Proving the lottery ticket hypothesis: Pruning is all you need
The lottery ticket hypothesis (Frankle and Carbin, 2018) states that a randomly initialized
network contains a small subnetwork that, when trained in isolation, can compete with …
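The subnetwork in question is typically exposed by magnitude pruning: zeroing out all but the largest-magnitude weights via a binary mask. A minimal sketch of that masking step (the layer shape and the 90% sparsity level are illustrative assumptions, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Randomly initialized weight matrix (hypothetical single-layer example).
W = rng.normal(size=(256, 256))

# Lottery-ticket-style magnitude pruning: keep only the largest weights
# by absolute value and zero the rest with a binary mask. The sparsity
# level below is an assumption for illustration.
sparsity = 0.9
threshold = np.quantile(np.abs(W), sparsity)
mask = (np.abs(W) >= threshold).astype(W.dtype)
W_sub = W * mask  # the candidate "winning ticket" subnetwork's weights

print(f"fraction of weights kept: {mask.mean():.2f}")
```

In the full procedure the surviving weights are reset to their original random initialization and the masked subnetwork is trained on its own.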
Statistical-query lower bounds via functional gradients
We give the first statistical-query lower bounds for agnostically learning any non-polynomial
activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific …
A physics-informed multi-agent model to predict thermo-oxidative/hydrolytic aging of elastomers
This paper introduces a novel physics-informed multi-agent constitutive model to predict
the quasi-static constitutive behavior of cross-linked elastomers and the loss of …
Memory capacity of neural networks with threshold and rectified linear unit activations
R Vershynin - SIAM Journal on Mathematics of Data Science, 2020 - SIAM
Overwhelming theoretical and empirical evidence shows that mildly overparametrized
neural networks---those with more connections than the size of the training data---are often …
AESPA: Accuracy preserving low-degree polynomial activation for fast private inference
The hybrid private inference (PI) protocol, which synergistically utilizes both multi-party
computation (MPC) and homomorphic encryption, is one of the most prominent techniques …
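The motivation for low-degree polynomial activations is that MPC and homomorphic encryption handle additions and multiplications cheaply, while exact nonlinearities like ReLU are expensive. A minimal sketch of the underlying idea, replacing ReLU with a degree-2 least-squares fit (the fitting interval and degree are illustrative assumptions, not the AESPA construction itself):

```python
import numpy as np

# Fit a degree-2 polynomial to ReLU over [-3, 3] as an illustrative
# stand-in for a low-degree polynomial activation. Under MPC/HE, this
# polynomial can be evaluated with two multiplications instead of a
# costly secure comparison.
x = np.linspace(-3, 3, 1001)
relu = np.maximum(x, 0.0)
coeffs = np.polyfit(x, relu, deg=2)   # [c2, c1, c0], highest degree first
poly_act = np.polyval(coeffs, x)

max_err = np.max(np.abs(poly_act - relu))
print(f"coefficients: {coeffs.round(3)}, max abs error on [-3, 3]: {max_err:.3f}")
```

By symmetry of ReLU(x) = (x + |x|)/2, the linear coefficient of the fit is exactly 0.5; the approximation error is what accuracy-preserving schemes such as AESPA work to control.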
A modular analysis of provable acceleration via Polyak's momentum: Training a wide ReLU network and a deep linear network
Incorporating a so-called “momentum” dynamic in gradient descent methods is widely used
in neural net training as it has been broadly observed that, at least empirically, it often leads …
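On quadratic objectives, Polyak's heavy-ball update and its accelerated rate can be seen directly. A textbook sketch with the classical parameter choices (this illustrates the momentum dynamic itself, not the paper's neural-network analysis):

```python
import numpy as np

# Polyak's heavy-ball method on an ill-conditioned quadratic
# f(x) = 0.5 * x^T A x, minimized at x = 0. Step size eta and momentum
# beta are the classical optimal choices for quadratics, giving a
# convergence rate depending on sqrt(kappa) rather than kappa.
A = np.diag([1.0, 100.0])          # condition number kappa = 100
mu, L = 1.0, 100.0
eta = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

x = x_prev = np.array([1.0, 1.0])
for _ in range(200):
    grad = A @ x
    # x_{t+1} = x_t - eta * grad + beta * (x_t - x_{t-1})
    x, x_prev = x - eta * grad + beta * (x - x_prev), x

print(f"distance to minimizer after 200 steps: {np.linalg.norm(x):.2e}")
```

Plain gradient descent on the same problem contracts at roughly (kappa - 1)/(kappa + 1) per step, which is why the momentum term is so widely used in practice.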
Non-asymptotic approximations of neural networks by Gaussian processes
We study the extent to which wide neural networks may be approximated by Gaussian
processes, when initialized with random weights. It is a well-established fact that as the …
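The well-established limiting fact the entry refers to can be checked by Monte Carlo: the output of a randomly initialized wide one-hidden-layer network at a fixed input is approximately Gaussian, with variance given by the corresponding kernel. A sketch (width, sample count, and the ReLU choice are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Output of a one-hidden-layer ReLU network with i.i.d. standard normal
# weights and 1/sqrt(width) output scaling, at a fixed unit-norm input.
width, n_samples = 512, 5000
x = np.array([1.0, 0.0])

W = rng.normal(size=(n_samples, width, 2))   # hidden-layer weights
v = rng.normal(size=(n_samples, width))      # output-layer weights
pre = np.maximum(W @ x, 0.0)                 # hidden activations
out = (v * pre).sum(axis=1) / np.sqrt(width)

# In the infinite-width limit, out ~ N(0, E[relu(g)^2]) with g ~ N(0,1),
# i.e. variance 1/2 for a unit-norm input.
print(f"empirical variance: {out.var():.3f} (limiting value: 0.5)")
```

The paper's contribution is non-asymptotic: quantifying how close this distribution is to the Gaussian limit at finite width, rather than only in the limit.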
DiGRAF: Diffeomorphic graph-adaptive activation function
In this paper, we propose a novel activation function tailored specifically for graph data in
Graph Neural Networks (GNNs). Motivated by the need for graph-adaptive and flexible …
Effects of nonlinearity and network architecture on the performance of supervised neural networks
The nonlinearity of activation functions used in deep learning models is crucial for the
success of predictive models. Several simple nonlinear functions, including Rectified Linear …
Characterizing the spectrum of the NTK via a power series expansion
Under mild conditions on the network initialization we derive a power series expansion for
the Neural Tangent Kernel (NTK) of arbitrarily deep feedforward networks in the infinite …
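In the infinite-width limit, the NTK of such networks is a function of the input correlation rho = <x, x'> (for unit-norm inputs), built from Gaussian expectations of the activation. A sketch verifying one such building block for ReLU against its known closed form, the first-order arc-cosine kernel (this is an illustrative check of the kernel object being expanded, not the paper's power series itself):

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte-Carlo estimate of E[relu(u) relu(v)] for jointly Gaussian
# (u, v) with unit variances and correlation rho, compared with the
# closed-form arc-cosine kernel expression.
rho = 0.3
n = 2_000_000
u = rng.normal(size=n)
w = rng.normal(size=n)
v = rho * u + np.sqrt(1 - rho**2) * w   # corr(u, v) = rho

mc = np.mean(np.maximum(u, 0) * np.maximum(v, 0))
closed = (np.sqrt(1 - rho**2) + rho * (np.pi - np.arccos(rho))) / (2 * np.pi)
print(f"Monte Carlo: {mc:.4f}, closed form: {closed:.4f}")
```

Expanding such kernels as power series in rho is what lets the paper characterize the NTK's eigenvalue spectrum through the series coefficients.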