Data-driven aerospace engineering: reframing the industry with machine learning
Data science, and machine learning in particular, is rapidly transforming the scientific and
industrial landscapes. The aerospace industry is poised to capitalize on big data and …
Directional convergence and alignment in deep learning
In this paper, we show that although the minimizers of cross-entropy and related
classification losses are off at infinity, network weights learned by gradient flow converge in …
Learning single-index models with shallow neural networks
Single-index models are a class of functions given by an unknown univariate "link" function
applied to an unknown one-dimensional projection of the input. These models are …
Gradient descent provably optimizes over-parameterized neural networks
One of the mysteries in the success of neural networks is that randomly initialized first order
methods like gradient descent can achieve zero training loss even though the objective …
Unbalanced minibatch optimal transport; applications to domain adaptation
Optimal transport distances have found many applications in machine learning for their
capacity to compare non-parametric probability distributions. Yet their algorithmic complexity …
Gradient descent maximizes the margin of homogeneous neural networks
In this paper, we study the implicit regularization of the gradient descent algorithm in
homogeneous neural networks, including fully-connected and convolutional neural …
Global optimality guarantees for policy gradient methods
Policy gradient methods apply to complex, poorly understood control problems by
performing stochastic gradient descent over a parameterized class of policies. Unfortunately …
Gradient descent on two-layer nets: Margin maximization and simplicity bias
The generalization mystery of overparametrized deep nets has motivated efforts to
understand how gradient descent (GD) converges to low-loss solutions that generalize well …
Gradient-free methods for deterministic and stochastic nonsmooth nonconvex optimization
Nonsmooth nonconvex optimization problems broadly emerge in machine learning and
business decision making, whereas two core challenges impede the development of …
Algorithmic regularization in learning deep homogeneous models: Layers are automatically balanced
We study the implicit regularization imposed by gradient descent for learning multi-layer
homogeneous functions including feed-forward fully connected and convolutional deep …