HierSpeech: Bridging the gap between text and speech by hierarchical variational inference using self-supervised representations for speech synthesis SH Lee, SB Kim, JH Lee, E Song, MJ Hwang, SW Lee Advances in Neural Information Processing Systems 35, 16624-16636, 2022 | 48 | 2022 |
EmoQ-TTS: Emotion intensity quantization for fine-grained controllable emotional text-to-speech CB Im, SH Lee, SB Kim, SW Lee ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 45 | 2022 |
HierSpeech++: Bridging the gap between semantic and acoustic representation of speech by hierarchical variational inference for zero-shot speech synthesis SH Lee, HY Choi, SB Kim, SW Lee arXiv preprint arXiv:2311.12454, 2023 | 29 | 2023 |
Audio super-resolution with robust speech representation learning of masked autoencoder SB Kim, SH Lee, HY Choi, SW Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing 32, 1012-1022, 2024 | 12 | 2024 |
EmoSphere-TTS: Emotional style and intensity modeling via spherical emotion vector for controllable emotional text-to-speech DH Cho, HS Oh, SB Kim, SH Lee, SW Lee arXiv preprint arXiv:2406.07803, 2024 | 8 | 2024 |
TranSentence: Speech-to-speech Translation via Language-agnostic Sentence-level Speech Encoding without Language-parallel Data SB Kim, SH Lee, SW Lee ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 3 | 2024 |
EmoSphere++: Emotion-Controllable Zero-Shot Text-to-Speech via Emotion-Adaptive Spherical Vector DH Cho, HS Oh, SB Kim, SW Lee arXiv preprint arXiv:2411.02625, 2024 | 1 | 2024 |
JELLY: Joint Emotion Recognition and Context Reasoning with LLMs for Conversational Speech Synthesis JH Cha, SB Kim, HS Oh, SW Lee arXiv preprint arXiv:2501.04904, 2025 | | 2025 |
FLowHigh: Towards Efficient and High-Quality Audio Super-Resolution with Single-Step Flow Matching JH Yun, SB Kim, SW Lee arXiv preprint arXiv:2501.04926, 2025 | | 2025 |