To compress or not to compress—self-supervised learning and information theory: A review

R Shwartz-Ziv, Y LeCun - Entropy, 2024 - mdpi.com
Deep neural networks excel in supervised learning tasks but are constrained by the need for
extensive labeled data. Self-supervised learning emerges as a promising alternative …

Generalization bounds: Perspectives from information theory and PAC-Bayes

F Hellström, G Durisi, B Guedj… - … and Trends® in …, 2025 - nowpublishers.com
A fundamental question in theoretical machine learning is generalization. Over the past
decades, the PAC-Bayesian approach has been established as a flexible framework to …

Towards tracing trustworthiness dynamics: Revisiting pre-training period of large language models

C Qian, J Zhang, W Yao, D Liu, Z Yin, Y Qiao… - arXiv preprint arXiv …, 2024 - arxiv.org
Ensuring the trustworthiness of large language models (LLMs) is crucial. Most studies
concentrate on fully pre-trained LLMs to better understand and improve LLMs' …

Examining and combating spurious features under distribution shift

C Zhou, X Ma, P Michel… - … Conference on Machine …, 2021 - proceedings.mlr.press
A central goal of machine learning is to learn robust representations that capture the
fundamental relationship between inputs and output labels. However, minimizing training …

A measure of the complexity of neural representations based on partial information decomposition

DA Ehrlich, AC Schneider, V Priesemann… - arXiv preprint arXiv …, 2022 - arxiv.org
In neural networks, task-relevant information is represented jointly by groups of neurons.
However, the specific way in which this mutual information about the classification label is …

Using sliced mutual information to study memorization and generalization in deep neural networks

S Wongso, R Ghosh, M Motani - International Conference on …, 2023 - proceedings.mlr.press
In this paper, we study the memorization and generalization behavior of deep neural
networks (DNNs) using sliced mutual information (SMI), which is the average of the mutual …

Information flow in deep neural networks

R Shwartz-Ziv - arXiv preprint arXiv:2202.06749, 2022 - arxiv.org
Although deep neural networks have been immensely successful, there is no
comprehensive theoretical understanding of how they work or are structured. As a result …

Performance evaluation of deep learning models for image classification over small datasets: Diabetic foot case study

A Hernandez-Guedes, I Santana-Perez… - IEEE …, 2022 - ieeexplore.ieee.org
Data scarcity is a common and challenging issue when working with Artificial Intelligence
solutions, especially those including Deep Learning (DL) models for tasks such as image …

Fault detection using generalized autoencoder with neighborhood restriction for electrical drive systems of high-speed trains

S Wang, Y Ju, P **e, C Cheng - Control Engineering Practice, 2024‏ - Elsevier
Over the past two decades, fault detection of high-speed trains has become an active issue
in the transportation area. Recent work has demonstrated the benefits of autoencoder for …

Minimum description length and generalization guarantees for representation learning

M Sefidgaran, A Zaidi… - Advances in Neural …, 2023 - proceedings.neurips.cc
A major challenge in designing efficient statistical supervised learning algorithms is finding
representations that perform well not only on available training samples but also on unseen …