Diff-TTS: A denoising diffusion model for text-to-speech M Jeong, H Kim, SJ Cheon, BJ Choi, NS Kim arXiv preprint arXiv:2104.01409, 2021 | 210 | 2021 |
Expressive text-to-speech using style tag M Kim, SJ Cheon, BJ Choi, JJ Kim, NS Kim arXiv preprint arXiv:2104.00436, 2021 | 60 | 2021 |
Transfer learning framework for low-resource text-to-speech using a large-scale unlabeled speech corpus M Kim, M Jeong, BJ Choi, S Ahn, JY Lee, NS Kim arXiv preprint arXiv:2203.15447, 2022 | 29 | 2022 |
SNAC: Speaker-normalized affine coupling layer in flow-based architecture for zero-shot multi-speaker text-to-speech BJ Choi, M Jeong, JY Lee, NS Kim IEEE Signal Processing Letters 29, 2502-2506, 2022 | 17 | 2022 |
WaveNODE: A continuous normalizing flow for speech synthesis H Kim, H Lee, WH Kang, SJ Cheon, BJ Choi, NS Kim arXiv preprint arXiv:2006.04598, 2020 | 14 | 2020 |
Reformer-TTS: Neural Speech Synthesis with Reformer Network HR Ihm, JY Lee, BJ Choi, SJ Cheon, NS Kim INTERSPEECH, 2012-2016, 2020 | 12 | 2020 |
Transduce and speak: Neural transducer for text-to-speech with semantic token prediction M Kim, M Jeong, BJ Choi, D Lee, NS Kim 2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023 | 11 | 2023 |
Acoustic Modeling Using Adversarially Trained Variational Recurrent Neural Network for Speech Synthesis JY Lee, SJ Cheon, BJ Choi, NS Kim, E Song INTERSPEECH, 917-921, 2018 | 8 | 2018 |
Adversarial speaker-consistency learning using untranscribed speech data for zero-shot multi-speaker text-to-speech BJ Choi, M Jeong, M Kim, SH Mun, NS Kim 2022 Asia-Pacific Signal and Information Processing Association Annual …, 2022 | 5 | 2022 |
Gated recurrent attention for multi-style speech synthesis SJ Cheon, JY Lee, BJ Choi, H Lee, NS Kim Applied Sciences 10 (15), 5325, 2020 | 5 | 2020 |
Memory attention: Robust alignment using gating mechanism for end-to-end speech synthesis JY Lee, SJ Cheon, BJ Choi, NS Kim IEEE Signal Processing Letters 27, 2004-2008, 2020 | 4 | 2020 |
Utilizing Neural Transducers for Two-Stage Text-to-Speech via Semantic Token Prediction M Kim, M Jeong, BJ Choi, S Kim, JY Lee, NS Kim arXiv preprint arXiv:2401.01498, 2024 | 3 | 2024 |
Transfer Learning for Low-Resource, Multi-Lingual, and Zero-Shot Multi-Speaker Text-to-Speech M Jeong, M Kim, BJ Choi, J Yoon, W Jang, NS Kim IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 2 | 2024 |
A controllable multi-lingual multi-speaker multi-style text-to-speech synthesis with multivariate information minimization SJ Cheon, BJ Choi, M Kim, H Lee, NS Kim IEEE Signal Processing Letters 29, 55-59, 2021 | 2 | 2021 |
MakeSinger: A Semi-Supervised Training Method for Data-Efficient Singing Voice Synthesis via Classifier-free Diffusion Guidance S Kim, M Jeong, H Lee, M Kim, BJ Choi, NS Kim arXiv preprint arXiv:2406.05965, 2024 | 1 | 2024 |
Variable-Length Speaker Conditioning in Flow-Based Text-to-Speech BJ Choi, M Jeong, M Kim, NS Kim IEEE Signal Processing Letters, 2024 | 1 | 2024 |