A review of deep learning techniques for speech processing
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …
learning. The use of multiple processing layers has enabled the creation of models capable …
Conversational agents in therapeutic interventions for neurodevelopmental disorders: a survey
Neurodevelopmental Disorders (NDD) are a group of conditions with onset in the
developmental period characterized by deficits in the cognitive and social areas …
developmental period characterized by deficits in the cognitive and social areas …
Efficient large language models: A survey
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …
tasks such as natural language understanding and language generation, and thus have the …
Longnet: Scaling transformers to 1,000,000,000 tokens
Scaling sequence length has become a critical demand in the era of large language models.
However, existing methods struggle with either computational complexity or model …
However, existing methods struggle with either computational complexity or model …
Modular deep learning
Transfer learning has recently become the dominant paradigm of machine learning. Pre-
trained models fine-tuned for downstream tasks achieve better performance with fewer …
trained models fine-tuned for downstream tasks achieve better performance with fewer …
NusaCrowd: Open source initiative for Indonesian NLP resources
We present NusaCrowd, a collaborative initiative to collect and unify existing resources for
Indonesian languages, including opening access to previously non-public resources …
Indonesian languages, including opening access to previously non-public resources …
Learning hierarchical cross-modal association for co-speech gesture generation
Generating speech-consistent body and gesture movements is a long-standing problem in
virtual avatar creation. Previous studies often synthesize pose movement in a holistic …
virtual avatar creation. Previous studies often synthesize pose movement in a holistic …
Deep transfer learning for automatic speech recognition: Towards better generalization
Automatic speech recognition (ASR) has recently become an important challenge when
using deep learning (DL). It requires large-scale training datasets and high computational …
using deep learning (DL). It requires large-scale training datasets and high computational …
Transformers in speech processing: A survey
The remarkable success of transformers in the field of natural language processing has
sparked the interest of the speech-processing community, leading to an exploration of their …
sparked the interest of the speech-processing community, leading to an exploration of their …
Audio-driven co-speech gesture video generation
Co-speech gesture is crucial for human-machine interaction and digital entertainment. While
previous works mostly map speech audio to human skeletons (eg, 2D keypoints), directly …
previous works mostly map speech audio to human skeletons (eg, 2D keypoints), directly …