Hierarchical conditional end-to-end ASR with CTC and multi-granular subword units
In end-to-end automatic speech recognition (ASR), a model is expected to implicitly learn
representations suitable for recognizing a word-level sequence. However, the huge …
Disordered speech recognition considering low resources and abnormal articulation
The success of automatic speech recognition (ASR) benefits a great number of healthy
people, but not people with disorders. The speech disordered may truly need support from …
PhISANet: Phonetically Informed Speech Animation Network
Realistic animation is crucial for immersive and seamless human-avatar interactions as
digital avatars become more prevalent. This work presents PhISANet, an encoder-decoder …
Reconnaissance automatique de la parole d'enfants apprenant·e·s lecteur·ice·s en salle de classe : modélisation acoustique de phonèmes
L Gelin - 2022 - theses.hal.science
In this thesis work, we seek to improve the phonetic transcriptions
of oral readings by children learning to read, carried out in …