Learning with norm constrained, over-parameterized, two-layer neural networks

F Liu, L Dadi, V Cevher - Journal of Machine Learning Research, 2024 - jmlr.org
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space
for modeling functions learned by neural networks, as the curse of dimensionality (CoD) cannot be …
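
The truncated claim above is the standard starting point for this line of work: the relevant
hypothesis class is a norm-constrained (Barron-type) ball rather than an RKHS ball. A common
way to write it, with notation assumed here rather than taken from the paper, is

    \mathcal{F}_R = \Big\{ f(x) = \sum_{j=1}^{m} a_j\, \sigma(\langle w_j, x \rangle) \;:\; \sum_{j=1}^{m} |a_j|\, \|w_j\|_2 \le R \Big\},

where \sigma is the ReLU; balls of this type, unlike RKHS balls, are the usual candidates for
approximation rates that do not degrade exponentially with the input dimension.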

Unraveling attention via convex duality: Analysis and interpretations of vision transformers

A Sahiner, T Ergen, B Ozturkler… - International …, 2022 - proceedings.mlr.press
Vision transformers using self-attention or its proposed alternatives have demonstrated
promising results in many image-related tasks. However, the underpinning inductive bias of …

CRONOS: Enhancing deep learning with scalable GPU-accelerated convex neural networks

M Feng, Z Frangella, M Pilanci - Advances in Neural …, 2025 - proceedings.neurips.cc
We introduce the CRONOS algorithm for convex optimization of two-layer neural networks.
CRONOS is the first algorithm capable of scaling to high-dimensional datasets such as …

Optimal sets and solution paths of ReLU networks

A Mishkin, M Pilanci - International Conference on Machine …, 2023 - proceedings.mlr.press
We develop an analytical framework to characterize the set of optimal ReLU neural networks
by reformulating the non-convex training problem as a convex program. We show that the …
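
The convex program referenced in this snippet is, in the style of Pilanci and Ergen (2020), a
group-regularized problem over enumerated ReLU activation patterns D_i = diag(1[Xg >= 0]).
Below is a minimal brute-force sketch with cvxpy on a toy problem; the random pattern sampling
and all names are illustrative, not the paper's code.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)
    n, d = 20, 3                            # samples, input dimension
    X = rng.standard_normal((n, d))
    y = rng.standard_normal(n)
    lam = 0.1                               # weight-decay strength

    # Sample candidate activation patterns D_i = diag(1[X g >= 0]).
    G = rng.standard_normal((d, 50))
    patterns = np.unique((X @ G >= 0).T, axis=0)

    V = [cp.Variable(d) for _ in patterns]  # positive-branch neurons
    W = [cp.Variable(d) for _ in patterns]  # negative-branch neurons
    residual, reg, constraints = -y, 0, []
    for D, v, w in zip(patterns, V, W):
        Dm = np.diag(D.astype(float))
        S = 2 * Dm - np.eye(n)              # enforces the sign pattern
        residual = residual + Dm @ X @ (v - w)
        constraints += [S @ X @ v >= 0, S @ X @ w >= 0]
        reg = reg + cp.norm(v, 2) + cp.norm(w, 2)

    prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(residual) + lam * reg),
                      constraints)
    prob.solve()
    print("convex objective:", prob.value)

Sampling hyperplanes covers only a subset of the activation patterns, so the value obtained
upper-bounds the exact convex optimum; enumerating all patterns recovers the global optimum of
the regularized training problem for sufficiently wide networks.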

Efficient global optimization of two-layer ReLU networks: Quadratic-time algorithms and adversarial training

Y Bai, T Gautam, S Sojoudi - SIAM Journal on Mathematics of Data Science, 2023 - SIAM
The nonconvexity of the artificial neural network (ANN) training landscape makes
optimization difficult. While the traditional back-propagation stochastic gradient descent …

Variation spaces for multi-output neural networks: Insights on multi-task learning and network compression

J Shenouda, R Parhi, K Lee, RD Nowak - Journal of Machine Learning …, 2024 - jmlr.org
This paper introduces a novel theoretical framework for the analysis of vector-valued neural
networks through the development of vector-valued variation spaces, a new class of …
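
For vector-valued (K-output) networks, variation norms of this kind typically couple the outputs
through shared hidden neurons; a plausible form, assumed here for illustration rather than quoted
from the paper, is

    \|f\|_{\mathcal{V}} = \inf \Big\{ \sum_{j=1}^{m} \|a_j\|_2\, \|w_j\|_2 \;:\; f(x) = \sum_{j=1}^{m} a_j\, \sigma(\langle w_j, x \rangle),\; a_j \in \mathbb{R}^K \Big\},

and this coupling across outputs is the natural mechanism behind the multi-task learning and
network compression insights the abstract mentions.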

Convex relaxations of ReLU neural networks approximate global optima in polynomial time

S Kim, M Pilanci - arXiv preprint arXiv:2402.03625, 2024 - arxiv.org
In this paper, we study the optimality gap between two-layer ReLU networks regularized with
weight decay and their convex relaxations. We show that when the training data is random …
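
In related work, the relaxation in question is obtained by dropping the sign constraints of the
exact convex program (the (2D_i - I) X v_i >= 0 conditions in the sketch after the Mishkin and
Pilanci entry above), which leaves an unconstrained group-lasso problem; this reading is an
assumption based on that related work, since the truncated abstract does not spell the
relaxation out:

    \min_{\{v_i\}} \; \tfrac{1}{2} \Big\| \sum_i D_i X v_i - y \Big\|_2^2 + \lambda \sum_i \|v_i\|_2 .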

The real tropical geometry of neural networks

MC Brandenburg, G Loho, G Montúfar - arXiv preprint arXiv:2403.11871, 2024 - arxiv.org
We consider a binary classifier defined as the sign of a tropical rational function, that is, as
the difference of two convex piecewise linear functions. The parameter space of ReLU …
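
Concretely, a tropical rational function is the difference of two max-affine maps, so the
classifier in this snippet fits in a few lines. A minimal sketch with made-up parameters:

    import numpy as np

    def max_affine(x, A, b):
        # Convex piecewise linear map: max_i (a_i . x + b_i).
        return np.max(A @ x + b)

    def tropical_classifier(x, Ap, bp, Am, bm):
        # Sign of a tropical rational function, i.e. of the difference
        # of two convex piecewise linear functions.
        return np.sign(max_affine(x, Ap, bp) - max_affine(x, Am, bm))

    rng = np.random.default_rng(1)
    Ap, bp = rng.standard_normal((4, 2)), rng.standard_normal(4)
    Am, bm = rng.standard_normal((4, 2)), rng.standard_normal(4)
    print(tropical_classifier(np.array([0.5, -1.0]), Ap, bp, Am, bm))

Every ReLU network computes exactly such a difference of convex piecewise linear functions,
which is presumably where the truncated sentence about the parameter space of ReLU networks
is headed.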

Fuzzy Adaptive Knowledge-Based Inference Neural Networks: Design and Analysis

S Liu, SK Oh, W Pedrycz, B Yang… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
A novel fuzzy adaptive knowledge-based inference neural network (FAKINN) is proposed in
this study. Conventional fuzzy cluster-based neural networks (FCBNNs) suffer from the …

Why line search when you can plane search? SO-friendly neural networks allow per-iteration optimization of learning and momentum rates for every layer

B Shea, M Schmidt - arXiv preprint arXiv:2406.17954, 2024 - arxiv.org
We introduce the class of SO-friendly neural networks, which includes several models used in
practice, such as networks with 2 layers of hidden weights where the number of inputs is …
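
The plane search of the title generalizes a line search: each iteration minimizes the loss
jointly over a learning rate for the gradient and a momentum rate for the previous step. For
SO-friendly networks the paper can solve this two-dimensional subproblem exactly; the coarse
grid below is only a stand-in to illustrate the idea, and all names are ours.

    import numpy as np

    def plane_search_step(loss, w, grad, prev_step,
                          alphas=np.logspace(-4, 0, 9),
                          betas=np.linspace(0.0, 0.99, 9)):
        # Evaluate the loss over a grid of (learning rate, momentum)
        # pairs and return the best update in span{-grad, prev_step}.
        best_step, best_val = np.zeros_like(w), loss(w)
        for a in alphas:
            for b in betas:
                step = -a * grad + b * prev_step
                val = loss(w + step)
                if val < best_val:
                    best_step, best_val = step, val
        return best_step

    # Toy quadratic: an exact plane search would minimize this 2D
    # restriction in closed form; the grid is a cheap approximation.
    A = np.diag([1.0, 10.0])
    loss = lambda w: 0.5 * w @ A @ w
    w, step = np.array([1.0, 1.0]), np.zeros(2)
    for _ in range(10):
        step = plane_search_step(loss, w, A @ w, step)
        w = w + step
    print("final loss:", loss(w))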