Xinfa Zhu
Verified email at mail.nwpu.edu.cn
Title
Cited by
Year
Multi-speaker expressive speech synthesis via multiple factors decoupling
X Zhu, Y Lei, K Song, Y Zhang, T Li, L Xie
ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 20; Year: 2023
SELM: Speech enhancement using discrete tokens and language models
Z Wang, X Zhu, Z Zhang, YJ Lv, N Jiang, G Zhao, L Xie
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 19; Year: 2024
Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation
H Li, L Xue, H Guo, X Zhu, Y Lv, L Xie, Y Chen, H Yin, Z Li
arXiv preprint arXiv:2406.07422, 2024
Cited by: 18; Year: 2024
Cross-speaker emotion transfer through information perturbation in emotional speech synthesis
Y Lei, S Yang, X Zhu, L Xie, D Su
IEEE Signal Processing Letters 29, 1948-1952, 2022
Cited by: 18; Year: 2022
METTS: Multilingual emotional text-to-speech by cross-speaker and cross-lingual emotion transfer
X Zhu, Y Lei, T Li, Y Zhang, H Zhou, H Lu, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 14; Year: 2024
Vec-Tok Speech: Speech vectorization and tokenization for neural speech generation
X Zhu, Y Lv, Y Lei, T Li, W He, H Zhou, H Lu, L Xie
arXiv preprint arXiv:2310.07246, 2023
Cited by: 11; Year: 2023
DiCLET-TTS: Diffusion model based cross-lingual emotion transfer for text-to-speech—A study between English and Mandarin
T Li, C Hu, J Cong, X Zhu, J Li, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2023
Cited by: 10; Year: 2023
UniStyle: Unified style modeling for speaking style captioning and stylistic speech synthesis
X Zhu, W Tian, X Wang, L He, Y Xiao, X Wang, X Tan, S Zhao, L Xie
Proceedings of the 32nd ACM International Conference on Multimedia, 7513-7522, 2024
Cited by: 5; Year: 2024
SponTTS: modeling and transferring spontaneous style for TTS
H Li, X Zhu, L Xue, Y Song, Y Chen, L Xie
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and …, 2024
Cited by: 5; Year: 2024
Accent-VITS: accent transfer for end-to-end TTS
L Ma, Y Zhang, X Zhu, Y Lei, Z Ning, P Zhu, L Xie
National Conference on Man-Machine Speech Communication, 203-214, 2023
Cited by: 4; Year: 2023
Contrastive context-speech pretraining for expressive text-to-speech synthesis
Y Xiao, X Wang, X Tan, L He, X Zhu, S Zhao, T Lee
Proceedings of the 32nd ACM International Conference on Multimedia, 2099-2107, 2024
Cited by: 2; Year: 2024
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy
L Ma, X Zhu, Y Lv, Z Wang, Z Wang, W He, H Zhou, L Xie
arXiv preprint arXiv:2406.09844, 2024
Cited by: 2; Year: 2024
HIGNN-TTS: Hierarchical Prosody Modeling With Graph Neural Networks for Expressive Long-Form TTS
D Guo, X Zhu, L Xue, T Li, Y Lv, Y Jiang, L Xie
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-7, 2023
Cited by: 2; Year: 2023
Zero-Shot Emotion Transfer for Cross-Lingual Speech Synthesis
Y Li, X Zhu, Y Lei, H Li, J Liu, D Xie, L Xie
2023 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 1-8, 2023
Cited by: 2; Year: 2023
The NPU-HWC System for the ISCSLP 2024 Inspirational and Convincing Audio Generation Challenge
D Guo, J Yao, X Zhu, K Xia, Z Guo, Z Zhang, Y Wang, J Liu, L Xie
2024 IEEE 14th International Symposium on Chinese Spoken Language Processing …, 2024
Cited by: 1; Year: 2024
U-Style: Cascading U-nets with Multi-level Speaker and Style Modeling for Zero-Shot Voice Cloning
T Li, Z Wang, X Zhu, J Cong, Q Tian, Y Wang, L Xie
IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2024
Cited by: 1; Year: 2024
Multi-Speaker Expressive Speech Synthesis via Semi-supervised Contrastive Learning
X Zhu, Y Li, Y Lei, N Jiang, G Zhao, L Xie
arXiv preprint arXiv:2310.17101, 2023
Cited by: 1; Year: 2023
CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions
X Zhu, W Tian, X Wang, L He, X Wang, S Zhao, L Xie
arXiv preprint arXiv:2501.16761, 2025
Year: 2025
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia
X Geng, K Wei, Q Shao, S Liu, Z Lin, Z Zhao, G Li, W Tian, P Chen, Y Li, ...
arXiv preprint arXiv:2501.13306, 2025
Year: 2025
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training
X Zhu, L He, Y Xiao, X Wang, X Tan, S Zhao, L Xie
arXiv preprint arXiv:2501.04416, 2025
Year: 2025
Articles 1–20