A survey on deep reinforcement learning for audio-based applications

S Latif, H Cuayáhuitl, F Pervez, F Shamshad… - Artificial Intelligence …, 2023 - Springer
Deep reinforcement learning (DRL) is poised to revolutionise the field of artificial intelligence
(AI) by endowing autonomous systems with high levels of understanding of the real world …

Automatic spoken language acquisition based on observation and dialogue

R Komatsu, S Gao, W Hou, M Zhang… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Human babies are born without knowledge of any specific language. They acquire
language directly from observation and dialogue without being limited by the availability of …

Sound-Image Grounding Based Focusing Mechanism for Efficient Automatic Spoken Language Acquisition.

M Zhang, T Tanaka, W Hou, S Gao, T Shinozaki - Interspeech, 2020 - interspeech2020.org
The process of spoken language acquisition based on sound-image grounding has been
one of the topics that has attracted the most significant interest of linguists and human …

Continuous action space-based spoken language acquisition agent using residual sentence embedding and transformer decoder

R Komatsu, Y Kimura, T Okamoto… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Studies on spoken language acquisition agents aim to understand the mechanism of human
language learning and to realize it on computers. Existing open vocabulary agents first …

Pronunciation adaptive self speaking agent using wavegrad

T Tanaka, R Komatsu, T Okamoto… - Proc. AAAI …, 2022 - aaai-sas-2022.github.io
The ability to automatically learn to speak through observation and dialogue without relying
on labeled training data is essential for intelligent robots or agents to flexibly and …

Unsupervised spoken term discovery using wav2vec 2.0

Y Iwamoto, T Shinozaki - 2021 Asia-Pacific Signal and …, 2021 - ieeexplore.ieee.org
Unsupervised spoken term discovery is the task of finding recurring word-like patterns from
raw audio without any manual transcription. Several approaches have been investigated …

Unsupervised multilingual models of speech representation, an approach inspired by cognitive science

M de Seyssel - 2023 - theses.hal.science
Speech, serving as a key input in the early language acquisition process, carries different
types of information. This includes linguistic information, denoting the inherent meaning of …

Self-supervised spoken question understanding and speaking with automatic vocabulary learning

K Toyoda, Y Kimura, M Zhang, K Hino… - … 24th Conference of …, 2021 - ieeexplore.ieee.org
Spoken language acquisition involves automatically developing symbolic word concepts
grounding their meaning to the world, recognizing the words in spoken utterances, and …

Towards Learning to Speak and Hear Through Multi-Agent Communication over a Continuous Acoustic Channel

K Eloff, O Räsänen, HA Engelbrecht, A Pretorius… - arXiv preprint arXiv …, 2021 - arxiv.org
Multi-agent reinforcement learning has been used as an effective means to study emergent
communication between agents, yet little focus has been given to continuous acoustic …

Learning to speak and hear through multi-agent communication over a continuous acoustic channel

K Eloff - 2023 - scholar.sun.ac.za
Human infants acquire language in large part through continuous signalling with their
caregivers. By interacting and communicating with their caregivers, infants can observe the …