Understanding deep representation learning via layerwise feature compression and discrimination

P Wang, X Li, C Yaras, Z Zhu, L Balzano, W Hu… - arXiv preprint arXiv …, 2023 - arxiv.org
Over the past decade, deep learning has proven to be a highly effective tool for learning
meaningful features from raw data. However, it remains an open question how deep …

BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference

C Lee, SM Kwon, Q Qu, HS Kim - Advances in Neural …, 2025 - proceedings.neurips.cc
Large-scale foundation models have demonstrated exceptional performance in language
and vision tasks. However, the numerous dense matrix-vector operations involved in these …

Sharpness-Aware Lookahead for Accelerating Convergence and Improving Generalization

C Tan, J Zhang, J Liu, Y Gong - IEEE Transactions on Pattern …, 2024 - ieeexplore.ieee.org
Lookahead is a popular stochastic optimizer that can accelerate the training process of deep
neural networks. However, the solutions found by Lookahead often generalize worse than …

Neural collapse in multi-label learning with pick-all-label loss

P Li, X Li, Y Wang, Q Qu - arXiv preprint arXiv:2310.15903, 2023 - arxiv.org
We study deep neural networks for the multi-label classification (MLab) task through the lens
of neural collapse (NC). Previous works have been restricted to the multi-class classification …

Embedded prompt tuning: Towards enhanced calibration of pretrained models for medical images

W Zu, S **e, Q Zhao, G Li, L Ma - Medical Image Analysis, 2024 - Elsevier
Foundation models pre-trained on large-scale data have achieved widespread
success in various natural imaging downstream tasks. Parameter-efficient fine-tuning (PEFT) …

Approaching deep learning through the spectral dynamics of weights

D Yunis, KK Patel, S Wheeler, P Savarese… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose an empirical approach centered on the spectral dynamics of weights--the
behavior of singular values and vectors during optimization--to unify and clarify several …

A spring-block theory of feature learning in deep neural networks

C Shi, L Pan, I Dokmanić - arXiv preprint arXiv:2407.19353, 2024 - arxiv.org
Feature-learning deep nets progressively collapse data to a regular low-dimensional
geometry. How this phenomenon emerges from collective action of nonlinearity, noise …

Differentiable learning of generalized structured matrices for efficient deep neural networks

C Lee, HS Kim - arXiv preprint arXiv:2310.18882, 2023 - arxiv.org
This paper investigates efficient deep neural networks (DNNs) to replace dense
unstructured weight matrices with structured ones that possess desired properties. The …

On subdifferential chain rule of matrix factorization and beyond

J Guan, AMC So - arXiv preprint arXiv:2410.05022, 2024 - arxiv.org
In this paper, we study equality-type Clarke subdifferential chain rules of matrix factorization
and factorization machine. Specifically, we show for these problems that provided the latent …

Efficient compression of overparameterized deep models through low-dimensional learning dynamics

SM Kwon, Z Zhang, D Song, L Balzano… - arXiv preprint arXiv …, 2023 - arxiv.org
Overparameterized models have proven to be powerful tools for solving various machine
learning tasks. However, overparameterization often leads to a substantial increase in …