Inductive biases for deep learning of higher-level cognition

A Goyal, Y Bengio - Proceedings of the Royal Society A, 2022 - royalsocietypublishing.org
A fascinating hypothesis is that human and animal intelligence could be explained by a few
principles (rather than an encyclopaedic list of heuristics). If that hypothesis was correct, we …

On the implicit bias in deep-learning algorithms

G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly
successful in recent years and has led to dramatic improvements in multiple domains …

Balanced multimodal learning via on-the-fly gradient modulation

X Peng, Y Wei, A Deng, D Wang… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Audio-visual learning helps to comprehensively understand the world, by integrating
different senses. Accordingly, multiple input modalities are expected to boost model …

Fishr: Invariant gradient variances for out-of-distribution generalization

A Rame, C Dancette, M Cord - International Conference on …, 2022 - proceedings.mlr.press
Learning robust models that generalize well under changes in the data distribution is critical
for real-world applications. To this end, there has been a growing surge of interest to learn …

Federated learning with buffered asynchronous aggregation

J Nguyen, K Malik, H Zhan… - International …, 2022 - proceedings.mlr.press
Scalability and privacy are two critical concerns for cross-device federated learning (FL)
systems. In this work, we identify that synchronous FL cannot scale efficiently beyond a few …

Dark experience for general continual learning: a strong, simple baseline

P Buzzega, M Boschini, A Porrello… - Advances in neural …, 2020 - proceedings.neurips.cc
Continual Learning has inspired a plethora of approaches and evaluation settings; however,
the majority of them overlooks the properties of a practical scenario, where the data stream …

Towards efficient and scalable sharpness-aware minimization

Y Liu, S Mai, X Chen, CJ Hsieh… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Recently, Sharpness-Aware Minimization (SAM), which connects the geometry of
the loss landscape and generalization, has demonstrated a significant performance boost …

Towards theoretically understanding why SGD generalizes better than Adam in deep learning

P Zhou, J Feng, C Ma, C Xiong… - Advances in Neural …, 2020 - proceedings.neurips.cc
It is not clear yet why ADAM-alike adaptive gradient algorithms suffer from worse
generalization performance than SGD despite their faster training speed. This work aims to …

Class-incremental continual learning into the extended DER-verse

M Boschini, L Bonicelli, P Buzzega… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
The staple of human intelligence is the capability of acquiring knowledge in a continuous
fashion. In stark contrast, Deep Networks forget catastrophically and, for this reason, the sub …

Quantum natural gradient

J Stokes, J Izaac, N Killoran, G Carleo - Quantum, 2020 - quantum-journal.org
A quantum generalization of Natural Gradient Descent is presented as part of a general-
purpose optimization framework for variational quantum circuits. The optimization dynamics …