Normalization techniques in training DNNs: Methodology, analysis and application
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …
Sparse invariant risk minimization
Invariant Risk Minimization (IRM) is an emerging technique for extracting invariant
features to improve generalization under distributional shift. However, we find that there exists a …
Learning to optimize domain specific normalization for domain generalization
We propose a simple but effective multi-source domain generalization technique based on
deep neural networks by incorporating optimized normalization layers that are specific to …
Adaptively sparse transformers
Attention mechanisms have become ubiquitous in NLP. Recent architectures, notably the
Transformer, learn powerful context-aware word representations through layered, multi …
Adversarially adaptive normalization for single domain generalization
Single domain generalization aims to learn a model that performs well on many unseen
domains using training data from only one domain. Existing works focus on studying the …
A reversible automatic selection normalization (RASN) deep network for predicting in the smart agriculture system
Due to their nonlinear modeling capabilities, deep learning prediction networks have become
widely used in smart agriculture. Because the sensing data has noise and complex …
Towards understanding regularization in batch normalization
Batch Normalization (BN) improves both convergence and generalization in training neural
networks. This work explains these phenomena theoretically. We analyze BN by using a …
Differentiable learning-to-normalize via switchable normalization
We address a learning-to-normalize problem by proposing Switchable Normalization (SN),
which learns to select different normalizers for different normalization layers of a deep neural …
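The entry above describes learning to select among normalizers per layer. As context only, here is a minimal illustrative sketch of the idea for 2D activations, mixing batch-wise and layer-wise statistics with softmax weights; this is an assumption-laden simplification, not the authors' implementation (which also includes instance normalization over convolutional feature maps):

```python
import numpy as np

def softmax(w):
    e = np.exp(w - w.max())
    return e / e.sum()

def switchable_norm(x, gamma, beta, w_mean, w_var, eps=1e-5):
    # x: (batch, features). Compute candidate statistics from two
    # normalizers -- batch-wise (BN-style) and layer-wise (LN-style) --
    # then blend them with learned softmax weights.
    bn_mean = x.mean(axis=0, keepdims=True)   # shape (1, features)
    bn_var = x.var(axis=0, keepdims=True)
    ln_mean = x.mean(axis=1, keepdims=True)   # shape (batch, 1)
    ln_var = x.var(axis=1, keepdims=True)
    pm, pv = softmax(w_mean), softmax(w_var)
    mean = pm[0] * bn_mean + pm[1] * ln_mean  # broadcast to (batch, features)
    var = pv[0] * bn_var + pv[1] * ln_var
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.arange(32, dtype=float).reshape(8, 4)
# Weights heavily favoring the batch-wise statistics recover plain BN.
y = switchable_norm(x, 1.0, 0.0, np.array([10.0, -10.0]), np.array([10.0, -10.0]))
```

In the full method the mixing weights are trained jointly with the network, so each layer can settle on whichever normalizer (or blend) suits it.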
Micro-batch training with batch-channel normalization and weight standardization
Batch Normalization (BN) has become an out-of-the-box technique to improve deep network
training. However, its effectiveness is limited for micro-batch training, i.e., each GPU typically …
Cross-iteration batch normalization
A well-known issue of Batch Normalization is its significantly reduced effectiveness in the
case of small mini-batch sizes. When a mini-batch contains few examples, the statistics upon …
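As background for the batch-statistics issue the last several entries discuss, here is a minimal sketch of standard batch normalization (illustrative only, not taken from any of the cited papers); it makes clear why the per-feature mean and variance estimates become noisy when the mini-batch is small:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature using statistics of the current mini-batch.
    # When the batch contains few examples, these estimates are noisy --
    # the small-mini-batch problem the entries above address.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(32, 4))  # batch of 32, 4 features
y = batch_norm(x, gamma=np.ones(4), beta=np.zeros(4))
print(y.mean(axis=0))  # ≈ 0 for every feature
print(y.std(axis=0))   # ≈ 1 for every feature
```

Cross-iteration BN and batch-channel normalization with weight standardization are two of the responses surveyed above: they replace or supplement the single-batch statistics so the estimates stay usable at micro-batch sizes.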