On the convergence and calibration of deep learning with differential privacy
Differentially private (DP) training preserves data privacy, usually at the cost of slower
convergence (and thus lower accuracy), as well as more severe miscalibration than its non …
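The DP training this entry refers to is typically realized with DP-SGD (Abadi et al., 2016): clip each per-example gradient, then add Gaussian noise before averaging. A minimal numpy sketch on logistic regression; the clipping norm C, noise multiplier sigma, and learning rate are illustrative choices, not values from the paper.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, C=1.0, sigma=1.0, rng=None):
    """One DP-SGD step for logistic regression: clip each per-example
    gradient to L2 norm at most C, sum, add Gaussian noise with scale
    sigma * C, then average over the batch."""
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 / (1.0 + np.exp(-(X @ w)))        # predicted probabilities
    grads = (p - y)[:, None] * X              # per-example gradients, (n, d)
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    grads = grads * np.minimum(1.0, C / np.maximum(norms, 1e-12))
    noisy_sum = grads.sum(axis=0) + rng.normal(0.0, sigma * C, size=w.shape)
    return w - lr * noisy_sum / len(y)

# Toy run: the clipping bias and injected noise slow convergence, which
# is the accuracy/calibration cost the abstract discusses.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 5))
y = (X @ rng.normal(size=5) > 0).astype(float)
w = np.zeros(5)
for _ in range(200):
    w = dp_sgd_step(w, X, y, rng=rng)
```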
MetaNODE: Prototype optimization as a neural ODE for few-shot learning
Few-Shot Learning (FSL) is a challenging task, i.e., how to recognize novel classes
with only a few examples. Pre-training based methods effectively tackle the problem by pre …
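MetaNODE's core idea is to treat class prototypes as the state of an ODE and refine them by integrating a vector field. A minimal sketch, assuming the vector field is the negative gradient of a nearest-prototype cross-entropy loss on the support set and integrating with fixed-step Euler; the paper instead meta-learns the vector field, so this is only an illustrative stand-in.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def refine_prototypes(protos, feats, labels, t_end=1.0, steps=20):
    """Integrate d(protos)/dt = g(protos) with Euler steps, where g is
    hard-coded here as the negative gradient of a nearest-prototype
    cross-entropy loss on the support set (feats, labels)."""
    n_class = protos.shape[0]
    dt = t_end / steps
    onehot = np.eye(n_class)[labels]
    for _ in range(steps):
        # Negative squared distances act as class logits.
        d2 = ((feats[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
        p = softmax(-d2)                       # class posteriors, (n, K)
        coef = 2.0 * (p - onehot)
        # Gradient of the cross-entropy loss w.r.t. each prototype.
        grad = (coef[:, :, None] * (feats[:, None, :] - protos)).sum(0)
        protos = protos - dt * grad            # Euler step along -grad
    return protos
```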
Sparse neural additive model: Interpretable deep learning with feature selection via group sparsity
Interpretable machine learning has demonstrated impressive performance while preserving
explainability. In particular, neural additive models (NAM) offer interpretability to the …
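A NAM decomposes the prediction into per-feature subnetworks, f(x) = b0 + Σ_j f_j(x_j), and the group-sparsity variant in the title penalizes each subnetwork's weights as one group so entire features can be dropped. A minimal sketch, assuming one hidden layer per feature; sizes and initialization are illustrative.

```python
import numpy as np

class TinyNAM:
    """Neural additive model: one small subnet per input feature, with
    prediction = b0 + sum_j f_j(x_j), so each f_j can be inspected on
    its own. All parameters of feature j form one group for a
    group-lasso penalty, which can zero out entire features (the
    title's 'feature selection via group sparsity')."""
    def __init__(self, n_features, hidden=8, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        self.W1 = rng.normal(0, 0.5, (n_features, 1, hidden))
        self.b1 = np.zeros((n_features, hidden))
        self.W2 = rng.normal(0, 0.5, (n_features, hidden))
        self.b0 = 0.0

    def forward(self, X):
        out = np.full(X.shape[0], self.b0)
        for j in range(X.shape[1]):
            h = np.maximum(0.0, X[:, j:j+1] @ self.W1[j] + self.b1[j])
            out = out + h @ self.W2[j]         # feature j's contribution
        return out

    def group_penalty(self):
        # Unsquared L2 norm per feature group induces group sparsity.
        return sum(np.sqrt((self.W1[j] ** 2).sum() + (self.b1[j] ** 2).sum()
                           + (self.W2[j] ** 2).sum())
                   for j in range(self.W1.shape[0]))
```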
Node-adapter: Neural ordinary differential equations for better vision-language reasoning
In this paper, we consider the problem of prototype-based vision-language reasoning.
We observe that existing methods encounter three major challenges: 1) escalating …
Provable convergence of Nesterov's accelerated gradient method for over-parameterized neural networks
X Liu, Z Pan, W Tao - Knowledge-Based Systems, 2022 - Elsevier
Momentum methods, such as the heavy ball method (HB) and Nesterov's accelerated gradient
method (NAG), have been widely used in training neural networks by incorporating the …
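For reference, the two update rules named in the snippet, written for a generic differentiable loss; the step size alpha and momentum beta are illustrative, and the only structural difference is where the gradient is evaluated.

```python
import numpy as np

def heavy_ball_step(x, x_prev, grad, alpha=0.01, beta=0.9):
    """Polyak's heavy ball: gradient taken at the current iterate."""
    return x - alpha * grad(x) + beta * (x - x_prev)

def nag_step(x, x_prev, grad, alpha=0.01, beta=0.9):
    """Nesterov's accelerated gradient: gradient taken at the
    extrapolated look-ahead point y = x + beta * (x - x_prev)."""
    y = x + beta * (x - x_prev)
    return y - alpha * grad(y)

# Example: one step each on f(x) = 0.5 * ||x||^2, where grad(x) = x.
x, x_prev = np.ones(3), np.ones(3)
x_next = nag_step(x, x_prev, grad=lambda v: v)
```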
A convergence analysis of Nesterov's accelerated gradient method in training deep linear neural networks
X Liu, W Tao, Z Pan - Information Sciences, 2022 - Elsevier
As the training of deep neural networks incurs an expensive computational cost,
speeding up convergence is of great importance. Nesterov's accelerated gradient (NAG) …
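A minimal sketch of the setting this paper analyzes: a deep (here two-layer) linear network trained with NAG on a least-squares objective. Dimensions, initialization scale, and hyperparameters are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
Y = X @ rng.normal(size=(10, 10))          # realizable linear targets

# Two-layer linear network: Y_hat = X @ W1 @ W2.
W1 = rng.normal(0, 0.1, (10, 16)); W1_prev = W1.copy()
W2 = rng.normal(0, 0.1, (16, 10)); W2_prev = W2.copy()
alpha, beta = 0.02, 0.9                    # illustrative NAG hyperparameters

for _ in range(500):
    # Look-ahead (extrapolated) weights, the hallmark of NAG.
    Z1 = W1 + beta * (W1 - W1_prev)
    Z2 = W2 + beta * (W2 - W2_prev)
    R = X @ Z1 @ Z2 - Y                    # residual at the look-ahead point
    g1 = X.T @ R @ Z2.T / len(X)           # dL/dW1 for L = ||R||_F^2 / (2n)
    g2 = (X @ Z1).T @ R / len(X)           # dL/dW2
    W1_prev, W2_prev = W1, W2
    W1, W2 = Z1 - alpha * g1, Z2 - alpha * g2

print("final loss:", 0.5 * np.sum((X @ W1 @ W2 - Y) ** 2) / len(X))
```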
Covariate-balancing-aware interpretable deep learning models for treatment effect estimation
Estimating treatment effects is of great importance for many biomedical applications with
observational data. In particular, interpretability of the treatment effects is preferable for many …
A high-resolution dynamical view on momentum methods for over-parameterized neural networks
X Liu, W Tao, J Wang, Z Pan - arXiv preprint arXiv:2208.03941, 2022 - arxiv.org
Due to its simplicity and efficiency, the first-order gradient method has been widely
used in training neural networks. Although the optimization problem of the neural network is …
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
X Liu, W Tao, W Li, D Zhan, J Wang, Z Pan - Proceedings of the Thirty-Third …, 2024 - ijcai.org
Due to its simplicity and efficiency, the first-order gradient method has been extensively
employed in training neural networks. Although the optimization problem of the neural …
Accelerated analysis on the triple momentum method for a two-layer ReLU neural network
X Li, X Liu - Journal of King Saud University-Computer and …, 2024 - Elsevier
The momentum method has become the workhorse of the deep learning community. To
theoretically understand its success, researchers have put effort into demystifying its convergence …
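The triple momentum (TM) method this paper analyzes augments the iterate update with two extra extrapolation sequences. A sketch of the TM recursion for an L-smooth, m-strongly convex objective, using the parameter choice from Van Scoy et al. (2018); the paper itself studies a two-layer ReLU network rather than this generic setting.

```python
import numpy as np

def triple_momentum(grad, x0, L, m, iters=200):
    """Triple momentum method (Van Scoy et al., 2018): state xi_k with
    two auxiliary extrapolations, y_k where the gradient is evaluated
    and x_k reported as the output sequence."""
    kappa = L / m
    rho = 1.0 - 1.0 / np.sqrt(kappa)
    alpha = (1.0 + rho) / L
    beta = rho ** 2 / (2.0 - rho)
    gamma = rho ** 2 / ((1.0 + rho) * (2.0 - rho))
    delta = rho ** 2 / (1.0 - rho ** 2)
    xi, xi_prev = x0.copy(), x0.copy()
    for _ in range(iters):
        y = (1 + gamma) * xi - gamma * xi_prev
        xi, xi_prev = (1 + beta) * xi - beta * xi_prev - alpha * grad(y), xi
    return (1 + delta) * xi - delta * xi_prev   # output point x_k

# Toy usage on a quadratic with L = 1, m = 0.01 (condition number 100).
A = np.diag(np.linspace(0.01, 1.0, 20))
x = triple_momentum(lambda v: A @ v, np.ones(20), L=1.0, m=0.01)
```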