A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?
As ChatGPT goes viral, generative AI (AIGC, aka AI-generated content) has made headlines
everywhere because of its ability to analyze and create text, images, and beyond. With such …
Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Machine learning: Algorithms, real-world applications and research directions
IH Sarker - SN computer science, 2021 - Springer
In the current age of the Fourth Industrial Revolution (4 IR or Industry 4.0), the digital world
has a wealth of data, such as Internet of Things (IoT) data, cybersecurity data, mobile data …
Recurrent neural networks: A comprehensive review of architectures, variants, and applications
Recurrent neural networks (RNNs) have significantly advanced the field of machine learning
(ML) by enabling the effective processing of sequential data. This paper provides a …
A survey on neural speech synthesis
Text to speech (TTS), or speech synthesis, which aims to synthesize intelligible and natural
speech given text, is a hot research topic in speech, language, and machine learning …
Viola: Conditional language models for speech recognition, synthesis, and translation
Recent research shows a big convergence in model architecture, training objectives, and
inference methods across various tasks for different modalities. In this paper, we propose …
Conformer: Convolution-augmented transformer for speech recognition
Recently Transformer and Convolution neural network (CNN) based models have shown
promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural …
GShard: Scaling giant models with conditional computation and automatic sharding
Neural network scaling has been critical for improving the model quality in many real-world
machine learning applications with vast amounts of training data and compute. Although this …
End-to-end speech recognition: A survey
In the last decade of automatic speech recognition (ASR) research, the introduction of deep
learning has brought considerable reductions in word error rate of more than 50% relative …
TabNet: Attentive interpretable tabular learning
We propose a novel high-performance and interpretable canonical deep tabular data
learning architecture, TabNet. TabNet uses sequential attention to choose which features to …