A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need?

C Zhang, C Zhang, S Zheng, Y Qiao, C Li… - ar…

Developing real-time streaming transformer transducer for speech recognition on large-scale dataset

X Chen, Y Wu, Z Wang, S Liu… - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Recently, Transformer based end-to-end models have achieved great success in many
areas including speech recognition. However, compared to LSTM models, the heavy …

Survey of deep representation learning for speech emotion recognition

S Latif, R Rana, S Khalifa, R Jurdak… - IEEE Transactions …, 2021 - ieeexplore.ieee.org
Traditionally, speech emotion recognition (SER) research has relied on manually
handcrafted acoustic features using feature engineering. However, the design of …

Transformers in speech processing: A survey

S Latif, A Zaidi, H Cuayahuitl, F Shamshad… - ar…

Developing RNN-T models surpassing high-performance hybrid models with customization capability

J Li, R Zhao, Z Meng, Y Liu, W Wei… - arXiv preprint arXiv …, 2020 - arxiv.org
Because of its streaming nature, recurrent neural network transducer (RNN-T) is a very
promising end-to-end (E2E) model that may replace the popular hybrid model for automatic …