Title | Authors | Venue | Cited by | Year |
Adapting and controlling DNN-based speech synthesis using input codes | HT Luong, S Takaki, GE Henter, J Yamagishi | 2017 IEEE International Conference on Acoustics, Speech and Signal …, 2017 | 107 | 2017 |
Wasserstein GAN and waveform loss-based acoustic model training for multi-speaker text-to-speech synthesis systems using a WaveNet vocoder | Y Zhao, S Takaki, HT Luong, J Yamagishi, D Saito, N Minematsu | IEEE Access 6, 60478-60488, 2018 | 75 | 2018 |
Nautilus: a versatile voice cloning system | HT Luong, J Yamagishi | IEEE/ACM Transactions on Audio, Speech, and Language Processing 28, 2967-2981, 2020 | 61 | 2020 |
A non-expert Kaldi recipe for Vietnamese speech recognition system | HT Luong, HQ Vu | Proceedings of the Third International Workshop on Worldwide Language …, 2016 | 47 | 2016 |
Training multi-speaker neural text-to-speech systems using speaker-imbalanced speech corpora | HT Luong, X Wang, J Yamagishi, N Nishizawa | arXiv preprint arXiv:1904.00771, 2019 | 31 | 2019 |
Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech | HT Luong, J Yamagishi | 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 27 | 2019 |
Investigating accuracy of pitch-accent annotations in neural network-based speech synthesis and denoising effects | HT Luong, X Wang, J Yamagishi, N Nishizawa | arXiv preprint arXiv:1808.00665, 2018 | 24 | 2018 |
Scaling and bias codes for modeling speaker-adaptive DNN-based speech synthesis systems | HT Luong, J Yamagishi | 2018 IEEE Spoken Language Technology Workshop (SLT), 610-617, 2018 | 12 | 2018 |
Multimodal speech synthesis architecture for unsupervised speaker adaptation | HT Luong, J Yamagishi | arXiv preprint arXiv:1808.06288, 2018 | 11 | 2018 |
LaughNet: synthesizing laughter utterances from waveform silhouettes and a single laughter example | HT Luong, J Yamagishi | arXiv preprint arXiv:2110.04946, 2021 | 10 | 2021 |
A Unified Speaker Adaptation Method for Speech Synthesis using Transcribed and Untranscribed Speech with Backpropagation | HT Luong, J Yamagishi | arXiv preprint arXiv:1906.07414, 2019 | 9 | 2019 |
Temporal-channel modeling in multi-head self-attention for synthetic speech detection | DT Truong, R Tao, T Nguyen, HT Luong, KA Lee, ES Chng | arXiv preprint arXiv:2406.17376, 2024 | 7 | 2024 |
Latent linguistic embedding for cross-lingual text-to-speech and voice conversion | HT Luong, J Yamagishi | arXiv preprint arXiv:2010.03717, 2020 | 3 | 2020 |
NTU-NPU System for Voice Privacy 2024 Challenge | N Kuzmin, HT Luong, J Yao, L Xie, KA Lee, ES Chng | arXiv preprint arXiv:2410.02371, 2024 | 1 | 2024 |
Controlling Multi-Class Human Vocalization Generation via a Simple Segment-based Labeling Scheme | HT Luong, J Yamagishi | Proc. Interspeech 2023, 4379-4383, 2023 | 1 | 2023 |
A DNN-based text-to-speech synthesis system using speaker, gender, and age codes | HT Luong, S Takaki, SJ Kim, J Yamagishi | Journal of the Acoustical Society of America 140 (4_Supplement), 2962-2962, 2016 | 1 | 2016 |
Room Impulse Responses Help Attackers to Evade Deep Fake Detection | HT Luong, DT Truong, KA Lee, ES Chng | 2024 IEEE Spoken Language Technology Workshop (SLT), 623-629, 2024 | | 2024 |
LlamaPartialSpoof: An LLM-Driven Fake Speech Dataset Simulating Disinformation Generation | HT Luong, H Li, L Zhang, KA Lee, ES Chng | arXiv preprint arXiv:2409.14743, 2024 | | 2024 |
Preliminary study on using vector quantization latent spaces for TTS/VC systems with consistent performance | HT Luong, J Yamagishi | arXiv preprint arXiv:2106.13479, 2021 | | 2021 |
Deep learning based voice cloning framework for a unified system of text-to-speech and voice conversion | L Hieu-Thi, HT Luong | 総合研究大学院大学, 2020 | | 2020 |