TTS synthesis with bidirectional LSTM based recurrent neural networks Y Fan, Y Qian, FL Xie, FK Soong Fifteenth annual conference of the international speech communication …, 2014 | 638 | 2014 |
A KL divergence and DNN-based approach to voice conversion without parallel training sentences. FL Xie, FK Soong, H Li Interspeech, 287-291, 2016 | 88 | 2016 |
Sequence Error (SE) Minimization Training of Neural Network for Voice Conversion HL Feng-Long Xie, Yao Qian, Frank K. Soong INTERSPEECH, 2014 | 53 | 2014 |
A KL divergence and DNN approach to cross-lingual TTS FL Xie, FK Soong, H Li 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 31 | 2016 |
Improving end-to-end speech synthesis with local recurrent neural network enhanced transformer Y Zheng, X Li, F Xie, L Lu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 28 | 2020 |
MSMC-TTS: Multi-stage multi-codebook VQ-VAE based neural TTS H Guo, F Xie, X Wu, FK Soong, H Meng IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 1811-1824, 2023 | 14 | 2023 |
A multi-stage multi-codebook VQ-VAE approach to high-performance neural TTS H Guo, F Xie, FK Soong, X Wu, H Meng arXiv preprint arXiv:2209.10887, 2022 | 12 | 2022 |
Pitch transformation in neural network based voice conversion FL Xie, Y Qian, FK Soong, H Li The 9th International Symposium on Chinese Spoken Language Processing, 197-200, 2014 | 12 | 2014 |
Voice conversion with SI-DNN and KL divergence based mapping without parallel training data FL Xie, FK Soong, H Li Speech Communication 106, 57-67, 2019 | 10 | 2019 |
FireRedTTS: A foundation text-to-speech framework for industry-level generative speech applications HH Guo, K Liu, FY Shen, YC Wu, FL Xie, K Xie, KT Xu arXiv preprint arXiv:2409.03283, 2024 | 9 | 2024 |
Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder H Guo, F Xie, D Yang, H Lu, X Wu, H Meng arXiv preprint arXiv:2406.02940, 2024 | 5 | 2024 |
Towards High-Quality Neural TTS for Low-Resource Languages by Learning Compact Speech Representations H Guo, F Xie, X Wu, H Lu, H Meng arXiv preprint arXiv:2210.15131, 2022 | 5 | 2022 |
Triple M: a practical text-to-speech synthesis system with multi-guidance attention and multi-band multi-time LPCNet S Lin, F Xie, L Meng, X Li, L Lu arXiv preprint arXiv:2102.00247, 2021 | 5 | 2021 |
SoCodec: A semantic-ordered multi-stream speech codec for efficient language model based text-to-speech synthesis H Guo, F Xie, K Xie, D Yang, D Guo, X Wu, H Meng 2024 IEEE Spoken Language Technology Workshop (SLT), 645-651, 2024 | 4 | 2024 |
Tri-stage training with language-specific encoder and bilingual acoustic learner for code-switching speech recognition X Wang, Y Jin, F Xie, Y Long Applied Acoustics 218, 109883, 2024 | 4 | 2024 |
QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning H Guo, F Xie, J Kang, Y Xiao, X Wu, H Meng IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 3 | 2024 |
A new high quality trajectory tiling based hybrid TTS in real time FL Xie, XH Li, WC Su, L Lu, FK Soong ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 3 | 2021 |
Nana-HDR: A non-attentive non-autoregressive hybrid model for TTS S Lin, W Su, L Meng, F Xie, X Li, L Lu arXiv preprint arXiv:2109.13673, 2021 | 2 | 2021 |
An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data FL Xie, XH Li, B Liu, YB Zheng, L Meng, L Lu, FK Soong ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 2 | 2020 |
Cross Validation and Minimum Generation Error for improved model clustering in HMM-based TTS FKS Feng-Long Xie, Yi-Jian Wu ISCSLP, 2012 | 2* | 2012 |