Model compression and hardware acceleration for neural networks: A comprehensive survey
Domain-specific hardware is becoming a promising topic in the backdrop of improvement
slow down for general-purpose processors due to the foreseeable end of Moore's Law …
Normalization techniques in training DNNs: Methodology, analysis and application
Normalization techniques are essential for accelerating the training and improving the
generalization of deep neural networks (DNNs), and have successfully been used in various …
A survey of quantization methods for efficient neural network inference
This chapter provides approaches to the problem of quantizing the numerical values in deep
Neural Network computations, covering the advantages/disadvantages of current methods …
Pruning and quantization for deep neural network acceleration: A survey
Deep neural networks have been applied in many applications exhibiting extraordinary
abilities in the field of computer vision. However, complex network architectures challenge …
NeRV: Neural representations for videos
We propose a novel neural representation for videos (NeRV) which encodes videos in
neural networks. Unlike conventional representations that treat videos as frame sequences …
BatchCrypt: Efficient homomorphic encryption for Cross-Silo federated learning
Cross-silo federated learning (FL) enables organizations (e.g., financial or medical) to
collaboratively train a machine learning model by aggregating local gradient updates from …
MLPerf training benchmark
Machine learning is experiencing an explosion of software and hardware solutions,
and needs industry-standard performance benchmarks to drive design and enable …
Ultra-low precision 4-bit training of deep neural networks
In this paper, we propose a number of novel techniques and numerical representation
formats that enable, for the very first time, the precision of training systems to be aggressively …
Benchmarking TPU, GPU, and CPU platforms for deep learning
Training deep learning models is compute-intensive and there is an industry-wide trend
towards hardware specialization to improve performance. To systematically benchmark …
Photonic multiply-accumulate operations for neural networks
It has long been known that photonic communication can alleviate the data movement
bottlenecks that plague conventional microelectronic processors. More recently, there has …