ShuffleNet V2: Practical guidelines for efficient CNN architecture design
Current network architecture design is mostly guided by the indirect metric of computation
complexity, i.e., FLOPs. However, the direct metric, such as speed, also depends on the other …
Deep learning in electron microscopy
JM Ede - Machine Learning: Science and Technology, 2021 - iopscience.iop.org
Deep learning is transforming most areas of science and technology, including electron
microscopy. This review paper offers a practical perspective aimed at developers with …
Revisiting small batch training for deep neural networks
D Masters, C Luschi - arXiv preprint arXiv:1804.07612, 2018 - arxiv.org
Modern deep neural network training is typically based on mini-batch stochastic gradient
optimization. While the use of large mini-batches increases the available computational …
Train longer, generalize better: closing the generalization gap in large batch training of neural networks
Background: Deep learning models are typically trained using stochastic gradient descent or
one of its variants. These methods update the weights using their gradient, estimated from a …
On large-batch training for deep learning: Generalization gap and sharp minima
The stochastic gradient descent (SGD) method and its variants are algorithms of choice for
many Deep Learning tasks. These methods operate in a small-batch regime wherein a …
DeepChain: Auditable and privacy-preserving deep learning with blockchain-based incentive
Deep learning can achieve higher accuracy than traditional machine learning algorithms in
a variety of machine learning tasks. Recently, privacy-preserving deep learning has drawn …
Revisiting distributed synchronous SGD
Distributed training of deep learning models on large-scale training data is typically
conducted with asynchronous stochastic optimization to maximize the rate of updates, at the …
ImageNet training in minutes
In this paper, we investigate large scale computers' capability of speeding up deep neural
networks (DNN) training. Our approach is to use large batch size, powered by the Layer …
Predicting disruptive instabilities in controlled fusion plasmas through deep learning
Nuclear fusion power delivered by magnetic-confinement tokamak reactors holds the
promise of sustainable and clean energy. The avoidance of large-scale plasma instabilities …
An exhaustive survey on P4 programmable data plane switches: Taxonomy, applications, challenges, and future trends
Traditionally, the data plane has been designed with fixed functions to forward packets using
a small set of protocols. This closed-design paradigm has limited the capability of the …