AudioLDM 2: Learning holistic audio generation with self-supervised pretraining H Liu, Y Yuan, X Liu, X Mei, Q Kong, Q Tian, Y Wang, W Wang, Y Wang, ... IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024 | 127 | 2024 |
AudioLDM 2: Learning holistic audio generation with self-supervised pretraining H Liu, Q Tian, Y Yuan, X Liu, X Mei, Q Kong, Y Wang, W Wang, Y Wang, MD Plumbley arXiv preprint arXiv:2308.05734 8 (1), 2023 | 89 | 2023 |
Efficient neural music generation MWY Lam, Q Tian, T Li, Z Yin, S Feng, M Tu, Y Ji, R Xia, M Ma, X Song, ... Advances in Neural Information Processing Systems 36, 2024 | 56 | 2024 |
VoiceFixer: A unified framework for high-fidelity speech restoration H Liu, X Liu, Q Kong, Q Tian, Y Zhao, DL Wang, C Huang, Y Wang arXiv preprint arXiv:2204.05841, 2022 | 49 | 2022 |
VoiceFixer: Toward general speech restoration with neural vocoder H Liu, Q Kong, Q Tian, Y Zhao, DL Wang, C Huang, Y Wang arXiv preprint arXiv:2109.13731, 2021 | 48 | 2021 |
Neural vocoder is all you need for speech super-resolution H Liu, W Choi, X Liu, Q Kong, Q Tian, DL Wang arXiv preprint arXiv:2203.14941, 2022 | 42 | 2022 |
Neural Dubber: Dubbing for videos according to scripts C Hu, Q Tian, T Li, Y Wang, Y Wang, H Zhao Advances in Neural Information Processing Systems 34, 16582-16595, 2021 | 40 | 2021 |
TFGAN: Time and frequency domain based generative adversarial network for high-fidelity speech synthesis Q Tian, Y Chen, Z Zhang, H Lu, L Chen, L Xie, S Liu arXiv preprint arXiv:2011.12206, 2020 | 36 | 2020 |
LM-VC: Zero-shot voice conversion via speech generation based on language models Z Wang, Y Chen, L Xie, Q Tian, Y Wang IEEE Signal Processing Letters, 2023 | 34 | 2023 |
FeatherWave: An efficient high-fidelity neural vocoder with multi-band linear prediction Q Tian, Z Zhang, H Lu, LH Chen, S Liu arXiv preprint arXiv:2005.05551, 2020 | 34 | 2020 |
AdaDurIAN: Few-shot adaptation for neural text-to-speech with DurIAN Z Zhang, Q Tian, H Lu, LH Chen, S Liu arXiv preprint arXiv:2005.05642, 2020 | 32 | 2020 |
AudioSR: Versatile audio super-resolution at scale H Liu, K Chen, Q Tian, W Wang, MD Plumbley ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 25 | 2024 |
NeuFA: Neural network based end-to-end forced alignment with bidirectional attention mechanism J Li, Y Meng, Z Wu, H Meng, Q Tian, Y Wang, Y Wang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 24 | 2022 |
PolyVoice: Language Models for Speech to Speech Translation Q Dong, Z Huang, C Xu, Y Zhao, K Wang, X Cheng, T Ko, Q Tian, T Li, ... arXiv preprint arXiv:2306.02982v2, 2023 | 23 | 2023 |
AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining H Liu, Q Tian, Y Yuan, X Liu, X Mei, Q Kong, Y Wang, W Wang, Y Wang, MD Plumbley arXiv preprint arXiv:2308.05734 3, 2023 | 15 | 2023 |
Inferring speaking styles from multi-modal conversational context by multi-scale relational graph convolutional networks J Li, Y Meng, X Wu, Z Wu, J Jia, H Meng, Q Tian, Y Wang, Y Wang Proceedings of the 30th ACM International Conference on Multimedia, 5811-5820, 2022 | 15 | 2022 |
Controllable and lossless non-autoregressive end-to-end text-to-speech Z Liu, Q Tian, C Hu, X Liu, M Wu, Y Wang, H Zhao, Y Wang arXiv preprint arXiv:2207.06088, 2022 | 15 | 2022 |
Generative adversarial network based speaker adaptation for high fidelity WaveNet vocoder Q Tian, X Wan, S Liu arXiv preprint arXiv:1812.02339, 2018 | 13 | 2018 |
AudioLDM 2 H Liu, Q Tian, Y Yuan, X Liu, X Mei, Q Kong, Y Wang, W Wang, Y Wang, MD Plumbley | 12 | |
Cloning one’s voice using very limited data in the wild D Dai, Y Chen, L Chen, M Tu, L Liu, R Xia, Q Tian, Y Wang, Y Wang ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 11 | 2022 |