Efficient acceleration of deep learning inference on resource-constrained edge devices: A review
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …
Transformers in speech processing: A survey
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …
Annealing knowledge distillation
Significant memory and computational requirements of large deep neural networks restrict
their application on edge devices. Knowledge distillation (KD) is a prominent model …
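The annealing KD entry above concerns distilling a large teacher network into a smaller student using temperature-softened targets. As an illustrative sketch only (not the paper's actual method), here is a minimal Hinton-style distillation loss in plain Python, with the temperature annealed downward over training; all names and values are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature):
    """KL divergence between teacher and student soft targets,
    scaled by T^2 as in classic knowledge distillation."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl

# "Annealing" here means starting with a high temperature (very soft
# targets) and lowering it toward 1.0 as training progresses.
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
for t in (4.0, 2.0, 1.0):
    loss = kd_loss(student, teacher, t)
```

A higher temperature exposes more of the teacher's inter-class similarity structure early on; the schedule above is purely illustrative.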
One-shot model for mixed-precision quantization
Neural network quantization is a popular approach for model compression. Modern
hardware supports quantization in mixed-precision mode, which allows for greater …
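The mixed-precision entry above rests on the idea that different layers tolerate different bit-widths. As a hedged sketch (not the paper's one-shot method), the toy code below implements uniform symmetric fake-quantization and compares reconstruction error across bit-widths, which is the kind of signal a mixed-precision search uses to assign bits per layer; the weights shown are made up.

```python
def quantize(values, num_bits):
    """Uniform symmetric fake-quantization: quantize to num_bits,
    then dequantize back to floats."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(abs(v) for v in values) / qmax or 1.0  # guard all-zero input
    return [round(v / scale) * scale for v in values]

def quant_error(values, num_bits):
    """Mean squared reconstruction error at a given bit-width."""
    q = quantize(values, num_bits)
    return sum((v - w) ** 2 for v, w in zip(values, q)) / len(values)

# A sensitive layer would keep 8 bits; a robust one could drop to 4.
weights = [0.9, -0.4, 0.05, 0.7]
err_4bit = quant_error(weights, 4)
err_8bit = quant_error(weights, 8)
```

More bits never increase the error, so the search trades model size against the per-layer error (or a task-loss proxy) under a hardware budget.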
Autofreeze: Automatically freezing model blocks to accelerate fine-tuning
With the rapid adoption of machine learning (ML), a number of domains now use the
approach of fine-tuning models that were pre-trained on a large corpus of data. However …
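The AutoFreeze entry above is about speeding up fine-tuning by freezing model blocks once they stop changing. The toy function below is only an assumed illustration of that idea, not AutoFreeze's actual criterion: it freezes a contiguous prefix of blocks whose gradient norms have plateaued between epochs.

```python
def blocks_to_freeze(grad_norms_prev, grad_norms_curr, threshold=0.05):
    """Return indices of leading blocks whose gradient norms changed
    by less than `threshold` (relative) since the previous epoch.

    Freezing is kept prefix-contiguous (early layers first), so we
    stop at the first block that is still actively training.
    """
    frozen = []
    for i, (prev, curr) in enumerate(zip(grad_norms_prev, grad_norms_curr)):
        if prev > 0 and abs(curr - prev) / prev < threshold:
            frozen.append(i)
        else:
            break
    return frozen
```

Frozen blocks need no backward pass or optimizer update, which is where the fine-tuning speedup comes from.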
4-bit conformer with native quantization aware training for speech recognition
Reducing the latency and model size has always been a significant research problem for
live Automatic Speech Recognition (ASR) application scenarios. Along this direction, model …
USM-Lite: Quantization and Sparsity Aware Fine-Tuning for Speech Recognition with Universal Speech Models
End-to-end automatic speech recognition (ASR) models have seen revolutionary quality
gains with the recent development of large-scale universal speech models (USM). However …
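The USM-Lite entry above combines quantization with sparsity-aware fine-tuning. As a hedged sketch of the sparsity half only (not the paper's pipeline), the function below performs one-shot magnitude pruning, zeroing the smallest-magnitude fraction of a weight vector; in sparsity-aware fine-tuning this mask is typically applied repeatedly while training continues.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of weights.

    A toy, one-shot version: real pipelines prune structured groups
    and re-apply the mask during fine-tuning to recover accuracy.
    """
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]
```

Hardware only benefits when the resulting zeros follow a pattern it can exploit (e.g. block or 2:4 sparsity), which is why the pruning granularity matters as much as the sparsity level.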
Integer-only zero-shot quantization for efficient speech recognition
End-to-end neural network models achieve improved performance on various automatic
speech recognition (ASR) tasks. However, these models perform poorly on edge hardware …
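The integer-only quantization entry above targets edge hardware that lacks efficient floating-point units. As an assumed illustration (not the paper's zero-shot scheme), the sketch below shows the standard asymmetric affine mapping real ≈ scale · (q − zero_point) that integer-only inference builds on; the tensor values are invented.

```python
def quantize_tensor(values, num_bits=8):
    """Asymmetric uniform quantization of a float list to unsigned ints.

    Returns (q, scale, zero_point) with real ~= scale * (q - zero_point).
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard constant input
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer codes back to approximate real values."""
    return [scale * (qi - zero_point) for qi in q]
```

In a fully integer pipeline the dequantize step never runs on device: scales are folded into fixed-point multipliers so every matmul and activation stays in integer arithmetic.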
Multi-distribution noise quantisation: an extreme compression scheme for transformer according to parameter distribution
With the development of deep learning, neural networks are widely used in various fields,
and the improved model performance also introduces a considerable number of parameters …
Transformer-based Arabic dialect identification
This paper presents a dialect identification (DID) system based on the transformer neural
network architecture. The conventional convolutional neural network (CNN)-based systems …