Inductive biases for deep learning of higher-level cognition
A fascinating hypothesis is that human and animal intelligence could be explained by a few
principles (rather than an encyclopaedic list of heuristics). If that hypothesis was correct, we …
On the implicit bias in deep-learning algorithms
G Vardi - Communications of the ACM, 2023 - dl.acm.org
Deep learning has been highly
successful in recent years and has led to dramatic improvements in multiple domains …
Balanced multimodal learning via on-the-fly gradient modulation
Audio-visual learning helps to comprehensively understand the world, by integrating
different senses. Accordingly, multiple input modalities are expected to boost model …
Fishr: Invariant gradient variances for out-of-distribution generalization
Learning robust models that generalize well under changes in the data distribution is critical
for real-world applications. To this end, there has been a growing surge of interest to learn …
Federated learning with buffered asynchronous aggregation
Scalability and privacy are two critical concerns for cross-device federated learning (FL)
systems. In this work, we identify that synchronous FL cannot scale efficiently beyond a few …
Dark experience for general continual learning: a strong, simple baseline
Continual Learning has inspired a plethora of approaches and evaluation settings; however,
the majority of them overlooks the properties of a practical scenario, where the data stream …
Towards efficient and scalable sharpness-aware minimization
Recently, Sharpness-Aware Minimization (SAM), which connects the geometry of
the loss landscape and generalization, has demonstrated a significant performance boost …
Towards theoretically understanding why SGD generalizes better than Adam in deep learning
It is not clear yet why ADAM-alike adaptive gradient algorithms suffer from worse
generalization performance than SGD despite their faster training speed. This work aims to …
Class-incremental continual learning into the extended DER-verse
The staple of human intelligence is the capability of acquiring knowledge in a continuous
fashion. In stark contrast, Deep Networks forget catastrophically and, for this reason, the sub …
Quantum natural gradient
A quantum generalization of Natural Gradient Descent is presented as part of a general-
purpose optimization framework for variational quantum circuits. The optimization dynamics …