Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives

J Li, J Chen, Y Tang, C Wang, BA Landman… - Medical image …, 2023 - Elsevier
Transformer, one of the latest technological advances of deep learning, has gained
prevalence in natural language processing or computer vision. Since medical imaging bears …

Neural network approximation

R DeVore, B Hanin, G Petrova - Acta Numerica, 2021 - cambridge.org
Neural networks (NNs) are the method of choice for building learning algorithms. They are
now being investigated for other numerical tasks such as solving high-dimensional partial …

Evolutionary optimization of model merging recipes

T Akiba, M Shing, Y Tang, Q Sun, D Ha - Nature Machine Intelligence, 2025 - nature.com
Large language models (LLMs) have become increasingly capable, but their development
often requires substantial computational resources. Although model merging has emerged …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

The power of quantum neural networks

A Abbas, D Sutter, C Zoufal, A Lucchi, A Figalli… - Nature Computational …, 2021 - nature.com
It is unknown whether near-term quantum computers are advantageous for machine
learning tasks. In this work we address this question by trying to understand how powerful …

Sharpness-aware gradient matching for domain generalization

P Wang, Z Zhang, Z Lei… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The goal of domain generalization (DG) is to enhance the generalization capability of the
model learned from a source domain to other unseen domains. The recently developed …

User-friendly introduction to PAC-Bayes bounds

P Alquier - Foundations and Trends® in Machine Learning, 2024 - nowpublishers.com
Aggregated predictors are obtained by making a set of basic predictors vote according to
some weights, that is, to some probability distribution. Randomized predictors are obtained …

SWAD: Domain generalization by seeking flat minima

J Cha, S Chun, K Lee, HC Cho… - Advances in Neural …, 2021 - proceedings.neurips.cc
Domain generalization (DG) methods aim to achieve generalizability to an unseen
target domain by using only training data from the source domains. Although a variety of DG …

Bayesian deep learning and a probabilistic perspective of generalization

AG Wilson, P Izmailov - Advances in neural information …, 2020 - proceedings.neurips.cc
The key distinguishing property of a Bayesian approach is marginalization, rather than using
a single setting of weights. Bayesian marginalization can particularly improve the accuracy …

Towards efficient and scalable sharpness-aware minimization

Y Liu, S Mai, X Chen, CJ Hsieh… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Recently, Sharpness-Aware Minimization (SAM), which connects the geometry of
the loss landscape and generalization, has demonstrated a significant performance boost …