Normalization techniques in training DNNs: Methodology, analysis and application

L Huang, J Qin, Y Zhou, F Zhu, L Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …
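
As a concrete reference point for the techniques this survey covers, the sketch below implements the standard batch-normalization transform (the canonical textbook formulation; notation and variants differ across the methods the survey discusses):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Standard batch normalization over a mini-batch.

    x: (batch, features) activations; gamma/beta: learned scale and shift.
    """
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # standardize each feature
    return gamma * x_hat + beta              # learned affine transform

# Normalizing a random batch drives per-feature stats to ~(0, 1):
x = 3.0 * np.random.randn(32, 8) + 1.0
y = batch_norm(x, gamma=np.ones(8), beta=np.zeros(8))
print(y.mean(axis=0).round(3), y.std(axis=0).round(3))
```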

Understanding the generalization benefit of normalization layers: Sharpness reduction

K Lyu, Z Li, S Arora - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Normalization layers (e.g., Batch Normalization, Layer Normalization) were
introduced to help with optimization difficulties in very deep nets, but they clearly also help …
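
For reference, "sharpness" in this line of work is commonly formalized as the largest Hessian eigenvalue of the training loss; the paper's precise setting may differ, so this is only the usual definition:

```latex
% Sharpness as the top Hessian eigenvalue (one common formalization):
\mathrm{sharpness}(\theta) \;=\; \lambda_{\max}\!\bigl(\nabla^2 L(\theta)\bigr).
% The claimed generalization benefit is that training normalized networks
% implicitly biases \theta toward regions where this quantity is small.
```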

Fast mixing of stochastic gradient descent with normalization and weight decay

Z Li, T Wang, D Yu - Advances in Neural Information …, 2022 - proceedings.neurips.cc
We prove the Fast Equilibrium Conjecture proposed by Li et al. (2020), i.e.,
stochastic gradient descent (SGD) on a scale-invariant loss (e.g., using networks with various …
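
The structural facts behind this setting can be stated compactly: a scale-invariant loss forces the gradient to be orthogonal to the weight vector, so under SGD with weight decay the squared weight norm follows a simple recursion. This is a standard derivation, sketched here under the assumption that each mini-batch loss is itself scale-invariant:

```latex
% Scale invariance: L(cw) = L(w) for all c > 0.
% Differentiating in c at c = 1 gives Euler's identity:
\langle \nabla L(w),\, w \rangle \;=\; 0 .
% SGD with step size \eta and weight decay \lambda:
w_{t+1} \;=\; (1 - \eta\lambda)\, w_t \;-\; \eta\, \nabla L_{B_t}(w_t),
% and since the mini-batch gradient is orthogonal to w_t,
\|w_{t+1}\|^2 \;=\; (1 - \eta\lambda)^2 \|w_t\|^2 \;+\; \eta^2 \|\nabla L_{B_t}(w_t)\|^2 .
```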

RTUNet: Residual transformer UNet specifically for pancreas segmentation

C Qiu, Z Liu, Y Song, J Yin, K Han, Y Zhu, Y Liu… - … Signal Processing and …, 2023 - Elsevier
Accurate pancreas segmentation is crucial for the diagnostic assessment of pancreatic
cancer. However, large position changes, high variability in shape and size, and the …

Re-thinking the effectiveness of batch normalization and beyond

H Peng, Y Yu, S Yu - IEEE Transactions on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Batch normalization (BN) is used by default in many modern deep neural networks due to its
effectiveness in accelerating training convergence and boosting inference performance …

Batch normalization orthogonalizes representations in deep random networks

H Daneshmand, A Joudaki… - Advances in Neural …, 2021 - proceedings.neurips.cc
This paper underlines an elegant property of batch normalization (BN): Successive batch
normalizations with random linear updates make samples increasingly orthogonal. We …
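
A toy simulation in the spirit of this result (dimensions and seed are hypothetical; `bn` here is batch normalization without the learned affine part): stack random linear layers followed by BN and track how close the batch gets to pairwise orthogonality.

```python
import numpy as np

def bn(x, eps=1e-5):
    # Batch normalization without scale/shift: per-feature standardization.
    return (x - x.mean(0)) / np.sqrt(x.var(0) + eps)

def mean_abs_cosine(h):
    # Average |cosine similarity| between distinct samples in the batch.
    g = h @ h.T
    norms = np.sqrt(np.diag(g))
    c = g / np.outer(norms, norms)
    return np.abs(c[~np.eye(len(h), dtype=bool)]).mean()

rng = np.random.default_rng(0)
d, batch, depth = 64, 8, 50
h = rng.normal(size=(batch, d))
for layer in range(depth):
    w = rng.normal(size=(d, d)) / np.sqrt(d)  # random linear update
    h = bn(h @ w)
    if layer % 10 == 9:
        print(f"layer {layer + 1}: mean |cos| = {mean_abs_cosine(h):.3f}")
```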

On the impact of activation and normalization in obtaining isometric embeddings at initialization

A Joudaki, H Daneshmand… - Advances in Neural …, 2023 - proceedings.neurips.cc
In this paper, we explore the structure of the penultimate Gram matrix in deep neural
networks, which contains the pairwise inner products of outputs corresponding to a batch of …
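
The object of study can be written compactly (notation assumed here, not taken from the paper):

```latex
% Gram matrix of penultimate-layer outputs h(x_1), ..., h(x_n) for a batch:
G_{ij} \;=\; \bigl\langle h(x_i),\, h(x_j) \bigr\rangle ,
% an isometric embedding at initialization corresponds to G being
% (approximately) proportional to the identity: G \approx c\, I_n .
```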

Making batch normalization great in federated deep learning

J Zhong, HY Chen, WL Chao - arXiv preprint arXiv:2303.06530, 2023 - arxiv.org
Batch Normalization (BN) is widely used in centralized deep learning to improve
convergence and generalization. However, in federated learning (FL) with decentralized …
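
The core difficulty is that BN statistics computed on a client's local batches need not match the statistics of the pooled data. A minimal illustration with synthetic non-IID clients (this only demonstrates the mismatch, not the paper's proposed remedy):

```python
import numpy as np

rng = np.random.default_rng(1)
# Two clients with different (non-IID) feature distributions.
client_a = rng.normal(loc=0.0, scale=1.0, size=(256, 4))
client_b = rng.normal(loc=3.0, scale=0.5, size=(256, 4))

# Local BN statistics each client would compute on its own batches:
for name, data in [("client A", client_a), ("client B", client_b)]:
    print(name, "mean:", data.mean(0).round(2), "std:", data.std(0).round(2))

# Statistics of the pooled (centralized) data differ from both:
pooled = np.vstack([client_a, client_b])
print("pooled  mean:", pooled.mean(0).round(2), "std:", pooled.std(0).round(2))
```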

Multi-scale remaining useful life prediction using long short-term memory

Y Wang, Y Zhao - Sustainability, 2022 - mdpi.com
Predictive maintenance based on performance degradation is a crucial way to reduce
maintenance costs and potential failures in modern complex engineering systems. Reliable …

Towards training without depth limits: Batch normalization without gradient explosion

A Meterez, A Joudaki, F Orabona, A Immer… - arXiv preprint arXiv …, 2023 - arxiv.org
Normalization layers are one of the key building blocks for deep neural networks. Several
theoretical studies have shown that batch normalization improves signal propagation by …
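
A quick empirical probe of the gradient-explosion phenomenon the title refers to: measure the input-gradient norm of a plain Linear→BatchNorm stack at initialization as depth grows. The setup is a hypothetical toy, and growth rates depend on width, nonlinearity, and initialization:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def input_grad_norm(depth, width=128, batch=64):
    # Plain stack of Linear -> BatchNorm1d blocks, no activation or residuals.
    blocks = []
    for _ in range(depth):
        blocks += [nn.Linear(width, width), nn.BatchNorm1d(width)]
    net = nn.Sequential(*blocks)  # training mode: BN uses batch statistics
    x = torch.randn(batch, width, requires_grad=True)
    net(x).square().mean().backward()  # scalar surrogate loss
    return x.grad.norm().item()

for depth in (2, 8, 32):
    print(f"depth {depth:3d}: input-gradient norm = {input_grad_norm(depth):.2e}")
```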