Normalization techniques in training DNNs: Methodology, analysis and application
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …
Understanding the generalization benefit of normalization layers: Sharpness reduction
Abstract: Normalization layers (e.g., Batch Normalization, Layer Normalization) were
introduced to help with optimization difficulties in very deep nets, but they clearly also help …
Fast mixing of stochastic gradient descent with normalization and weight decay
Abstract: We prove the Fast Equilibrium Conjecture proposed by Li et al. (2020), i.e., that
stochastic gradient descent (SGD) on a scale-invariant loss (e.g., using networks with various …
RTUNet: Residual transformer UNet specifically for pancreas segmentation
Accurate pancreas segmentation is crucial for the diagnostic assessment of pancreatic
cancer. However, large position changes, high variability in shape and size, and the …
Re-thinking the effectiveness of batch normalization and beyond
Batch normalization (BN) is used by default in many modern deep neural networks due to its
effectiveness in accelerating training convergence and boosting inference performance …
Batch normalization orthogonalizes representations in deep random networks
This paper underlines an elegant property of batch-normalization (BN): Successive batch
normalizations with random linear updates make samples increasingly orthogonal. We …
normalizations with random linear updates make samples increasingly orthogonal. We …
On the impact of activation and normalization in obtaining isometric embeddings at initialization
In this paper, we explore the structure of the penultimate Gram matrix in deep neural
networks, which contains the pairwise inner products of outputs corresponding to a batch of …
Making batch normalization great in federated deep learning
Batch Normalization (BN) is widely used in centralized deep learning to improve
convergence and generalization. However, in federated learning (FL) with decentralized …
Multi-scale remaining useful life prediction using long short-term memory
Y Wang, Y Zhao - Sustainability, 2022 - mdpi.com
Predictive maintenance based on performance degradation is a crucial way to reduce
maintenance costs and potential failures in modern complex engineering systems. Reliable …
Towards training without depth limits: Batch normalization without gradient explosion
Normalization layers are one of the key building blocks for deep neural networks. Several
theoretical studies have shown that batch normalization improves the signal propagation, by …