Revisiting knowledge distillation: An inheritance and exploration framework

Z Huang, X Shen, J Xing, T Liu, X Tian… - Proceedings of the …, 2021 - openaccess.thecvf.com
Knowledge Distillation (KD) is a popular technique to transfer knowledge from a
teacher model or ensemble to a student model. Its success is generally attributed to the …
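
The snippet describes generic teacher-student distillation; the paper's inheritance-and-exploration split is not reproduced here. Below is a minimal sketch of the standard soft-label KD objective (Hinton-style), assuming PyTorch; the function name, temperature, and loss weighting are illustrative choices, not values from the paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard KD objective: blend hard-label cross-entropy with a
    temperature-softened KL term that matches teacher and student outputs."""
    # Soft targets from the (frozen) teacher, softened by temperature T.
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_student = F.log_softmax(student_logits / T, dim=-1)
    # KL divergence between softened distributions; T^2 rescales the gradient magnitude.
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Toy usage with random logits for a 10-class problem.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```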

Model-independent detection of new physics signals using interpretable semisupervised classifier tests

P Chakravarti, M Kuusela, J Lei… - The Annals of Applied …, 2023 - projecteuclid.org
The supplementary material contains the proof of Theorem 4.1, some of the proposed
algorithms from Section 3.2, and details about the exploratory data analysis of the Higgs …

Robust tensor decomposition via orientation invariant tubal nuclear norms

A Wang, QB Zhao, Z Jin, C Li, GX Zhou - Science China Technological …, 2022 - Springer
Aiming at recovering an unknown tensor (i.e., multi-way array) corrupted by both sparse
outliers and dense noise, robust tensor decomposition (RTD) serves as a powerful pre …
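
The tubal nuclear norm is built on the t-SVD, and most RTD solvers lean on its proximal operator: transform along the tube dimension, soft-threshold the singular values of each frontal slice, and transform back. The sketch below shows only that operator for the standard FFT-based t-product, not the paper's orientation-invariant norms or a complete RTD algorithm; the shapes, threshold `tau`, and toy data are assumptions.

```python
import numpy as np

def tubal_svt(X, tau):
    """Singular value thresholding of the tubal nuclear norm (the proximal step
    inside many robust tensor decomposition solvers): FFT along the tube
    dimension, soft-threshold the singular values of each frontal slice,
    then transform back."""
    n1, n2, n3 = X.shape
    Xf = np.fft.fft(X, axis=2)
    Yf = np.zeros_like(Xf)
    for k in range(n3):
        U, s, Vh = np.linalg.svd(Xf[:, :, k], full_matrices=False)
        s = np.maximum(s - tau, 0.0)            # soft-threshold singular values
        Yf[:, :, k] = (U * s) @ Vh
    return np.real(np.fft.ifft(Yf, axis=2))

# Toy usage: a low-tubal-rank tensor plus sparse outliers.
rng = np.random.default_rng(0)
L = np.einsum("ir,jr,kr->ijk", rng.normal(size=(20, 2)),
              rng.normal(size=(20, 2)), rng.normal(size=(15, 2)))
S = (rng.random(L.shape) < 0.05) * rng.normal(scale=10, size=L.shape)
L_hat = tubal_svt(L + S, tau=5.0)   # one shrinkage step toward the low-rank part
```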

On the deep active-subspace method

W Edeling - SIAM/ASA Journal on Uncertainty Quantification, 2023 - SIAM
The deep active-subspace method is a neural-network based tool for the propagation of
uncertainty through computational models with high-dimensional input spaces. Unlike the …
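
The deep active-subspace method replaces parts of this pipeline with a neural surrogate; the sketch below shows only the classical active-subspace construction it extends, namely an eigendecomposition of the empirical gradient covariance C = (1/n) Σ ∇f(x_i) ∇f(x_i)^T. The toy quadratic model and dimensions are assumptions for illustration.

```python
import numpy as np

def active_subspace(grad_samples, k):
    """Classical active-subspace construction (Constantine): eigendecompose the
    empirical covariance of gradient samples and keep the k leading eigenvectors,
    i.e. the directions along which f varies most."""
    G = np.asarray(grad_samples)              # shape (n_samples, dim)
    C = G.T @ G / G.shape[0]
    eigvals, eigvecs = np.linalg.eigh(C)      # ascending order
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order[:k]]

# Toy model f(x) = ||W x||^2 with a 2-dimensional active subspace inside 20 dims.
rng = np.random.default_rng(1)
W = rng.normal(size=(2, 20))
grad_f = lambda x: W.T @ (2.0 * (W @ x))      # gradient of ||W x||^2
grads = np.array([grad_f(rng.normal(size=20)) for _ in range(500)])
eigvals, basis = active_subspace(grads, k=2)  # basis spans roughly row(W)
```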

A supervised learning approach involving active subspaces for an efficient genetic algorithm in high-dimensional optimization problems

N Demo, M Tezzele, G Rozza - SIAM Journal on Scientific Computing, 2021 - SIAM
In this work, we present an extension of the genetic algorithm (GA) that exploits the
supervised learning technique called active subspaces (AS) to evolve the individuals on a …
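
The paper's algorithm alternates full-space and reduced-space evolution and refreshes the subspace as the population moves; the sketch below keeps only the core idea of evolving individuals in active-subspace coordinates and mapping them back for fitness evaluation. The objective, the back-mapping x = A z, population sizes, and GA operators are all simplified assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# High-dimensional objective with a low-dimensional active subspace (toy assumption).
dim, k = 50, 2
W = rng.normal(size=(k, dim))
f = lambda x: np.sum((W @ x - 1.0) ** 2)           # minimum reached along a 2-d subspace
grad_f = lambda x: 2.0 * W.T @ (W @ x - 1.0)

# Step 1: estimate the active subspace from gradient samples.
G = np.array([grad_f(rng.normal(size=dim)) for _ in range(200)])
_, vecs = np.linalg.eigh(G.T @ G / len(G))
A = vecs[:, -k:]                                    # leading eigenvectors, shape (dim, k)

# Step 2: run a plain GA on the k reduced coordinates; individuals are mapped
# back to the full space via x = A z before evaluating the fitness.
pop = rng.normal(size=(40, k))
for gen in range(100):
    fitness = np.array([f(A @ z) for z in pop])
    parents = pop[np.argsort(fitness)[:10]]         # truncation selection
    children = []
    for _ in range(30):
        a, b = parents[rng.integers(10, size=2)]
        child = np.where(rng.random(k) < 0.5, a, b)     # uniform crossover
        child = child + rng.normal(scale=0.1, size=k)   # Gaussian mutation
        children.append(child)
    pop = np.vstack([parents, np.array(children)])

best = pop[np.argmin([f(A @ z) for z in pop])]
print("best value:", f(A @ best))
```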

Low-rank gradient descent

R Cosson, A Jadbabaie, A Makur… - IEEE Open Journal …, 2023 - ieeexplore.ieee.org
Several recent empirical studies demonstrate that important machine learning tasks, such as
training deep neural networks, exhibit a low-rank structure, where most of the variation in the …
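
The snippet points at the empirical observation that gradients during training concentrate in a low-dimensional subspace. A minimal reading of "low-rank gradient descent" is to replace each gradient step with its best rank-r approximation; the sketch below does exactly that on a toy matrix least-squares problem. This is an illustrative simplification, not the algorithm or analysis from the paper.

```python
import numpy as np

def truncated_svd(M, r):
    """Best rank-r approximation of M via the SVD (Eckart-Young)."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vh[:r]

# Toy problem: recover a low-rank matrix by minimizing ||X - X_true||_F^2,
# replacing each full gradient with its rank-r truncation before stepping.
rng = np.random.default_rng(3)
X_true = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 30))   # rank-2 target
X = np.zeros((30, 30))
lr, r = 0.1, 2
for it in range(200):
    grad = 2.0 * (X - X_true)          # gradient of the quadratic loss
    X -= lr * truncated_svd(grad, r)   # descend along a rank-r surrogate gradient
print("relative error:", np.linalg.norm(X - X_true) / np.linalg.norm(X_true))
```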

Learning the subspace of variation for global optimization of functions with low effective dimension

C Cartis, X Liang, E Massart, A Otemissov - arXiv preprint arXiv …, 2024 - arxiv.org
We propose an algorithmic framework that employs active subspace techniques for
scalable global optimization of functions with low effective dimension (also referred to as low …
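
The framework in the paper learns the subspace of variation with active-subspace techniques and then runs a global solver inside it; as a plainly simpler stand-in, the sketch below uses the random-embedding idea for functions with low effective dimension: draw a tall Gaussian matrix B and search over the few embedded coordinates y, evaluating f(B y). The toy objective, box bounds, and plain random search are all assumptions for illustration, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(4)

# Function with low effective dimension: it only varies along 2 hidden directions.
D, d = 100, 2
hidden = rng.normal(size=(d, D))
f = lambda x: np.sum(np.sin(hidden @ x)) + np.sum((hidden @ x) ** 2)

# Random-embedding search: optimize over low-dimensional coordinates y and
# evaluate the original function at the embedded point B y.
B = rng.normal(size=(D, d))
best_y, best_val = None, np.inf
for _ in range(5000):                        # plain random search in the box [-3, 3]^d
    y = rng.uniform(-3.0, 3.0, size=d)
    val = f(B @ y)
    if val < best_val:
        best_y, best_val = y, val
print("best value found:", best_val)
```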

Learning active subspaces for effective and scalable uncertainty quantification in deep neural networks

S Jantre, NM Urban, X Qian… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Bayesian inference for neural networks, or Bayesian deep learning, has the potential to
provide well-calibrated predictions with quantified uncertainty and robustness. However, the …

A dimensionality reduction approach for convolutional neural networks

L Meneghetti, N Demo, G Rozza - Applied Intelligence, 2023 - Springer
The focus of this work is on the application of classical Model Order Reduction techniques,
such as Active Subspaces and Proper Orthogonal Decomposition, to Deep Neural …
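
The work couples Active Subspaces and Proper Orthogonal Decomposition with the network's layers to shrink intermediate feature maps before a reduced classifier; the sketch below shows only the generic POD step (SVD of a snapshot matrix of flattened activations, truncated at a chosen energy level), with synthetic features standing in for real CNN activations. The energy threshold and shapes are assumptions.

```python
import numpy as np

def pod_basis(snapshots, energy=0.99):
    """Proper Orthogonal Decomposition of a snapshot matrix: SVD of the centered
    snapshots, truncated at the requested fraction of retained energy."""
    S = snapshots - snapshots.mean(axis=0, keepdims=True)
    U, s, Vh = np.linalg.svd(S, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    r = int(np.searchsorted(cum, energy)) + 1
    return Vh[:r].T, r      # columns span the reduced space of the feature dimension

# Toy stand-in for flattened CNN feature maps: 200 "images", 4096 features,
# generated with an intrinsic dimension of about 10.
rng = np.random.default_rng(5)
features = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 4096))
basis, r = pod_basis(features)
reduced = features @ basis    # low-dimensional codes fed to a small classifier head
print("retained modes:", r, "reduced shape:", reduced.shape)
```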

Few-bit backward: Quantized gradients of activation functions for memory footprint reduction

GS Novikov, D Bershatsky, J Gusak… - International …, 2023 - proceedings.mlr.press
Memory footprint is one of the main limiting factors for large neural network training. In
backpropagation, one needs to store the input to each operation in the computational graph …
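
The snippet states the bottleneck: every operation's input must be kept for the backward pass. The sketch below illustrates the general idea with a custom PyTorch autograd Function that saves only a 4-bit (uint8-stored) code of the GELU input and dequantizes it to evaluate the derivative in backward; the paper itself quantizes the derivative on optimized grids with dedicated kernels, and all names, bit-widths, and clipping ranges here are assumptions.

```python
import math
import torch

class FewBitGELU(torch.autograd.Function):
    """GELU whose backward pass uses a coarsely quantized copy of the input
    instead of the full-precision activation, shrinking what must be stored
    between forward and backward."""

    @staticmethod
    def forward(ctx, x, bits, lo, hi):
        levels = 2 ** bits - 1
        # Store only a uint8 code per element plus the de-quantization range.
        code = torch.clamp((x - lo) / (hi - lo) * levels, 0, levels).round().to(torch.uint8)
        ctx.save_for_backward(code)
        ctx.quant = (lo, hi, levels)
        return torch.nn.functional.gelu(x)

    @staticmethod
    def backward(ctx, grad_out):
        (code,) = ctx.saved_tensors
        lo, hi, levels = ctx.quant
        x_hat = code.to(grad_out.dtype) / levels * (hi - lo) + lo   # dequantize
        # GELU'(x) = Phi(x) + x * phi(x), evaluated at the dequantized input.
        cdf = 0.5 * (1.0 + torch.erf(x_hat / math.sqrt(2.0)))
        pdf = torch.exp(-0.5 * x_hat ** 2) / math.sqrt(2.0 * math.pi)
        return grad_out * (cdf + x_hat * pdf), None, None, None

x = torch.randn(4, 16, requires_grad=True)
y = FewBitGELU.apply(x, 4, -4.0, 4.0).sum()
y.backward()      # gradient computed from 4-bit codes of the activations
```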