VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech. C Du, Y Guo, H Wang, Y Yang, Z Niu, S Wang, H Zhang, X Chen, K Yu. arXiv preprint arXiv:2401.14321, 2024 | 21 | 2024 |
Attention-Constrained Inference For Robust Decoder-Only Text-to-Speech. H Wang, C Du, Y Guo, S Wang, X Chen, K Yu. 2024 IEEE Spoken Language Technology Workshop (SLT), 630-637, 2024 | 2 | 2024 |
vec2wav 2.0: Advancing Voice Conversion via Discrete Token Vocoders. Y Guo, Z Li, J Li, C Du, H Wang, S Wang, X Chen, K Yu. arXiv preprint arXiv:2409.01995, 2024 | 2 | 2024 |
Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective. H Wang, H Wang, Y Guo, Z Li, C Du, X Chen, K Yu. arXiv preprint arXiv:2412.17048, 2024 | 1 | 2024 |
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge. Y Guo, C Wang, Y Yang, H Wang, Z Ma, C Du, S Wang, H Li, X Li, S Fan, ... . 2024 IEEE 14th International Symposium on Chinese Spoken Language Processing …, 2024 | 1 | 2024 |
LSCodec: Low-Bitrate and Speaker-Decoupled Discrete Speech Codec. Y Guo, Z Li, C Du, H Wang, X Chen, K Yu. arXiv preprint arXiv:2410.15764, 2024 | 1 | 2024 |
Recent Advances in Discrete Speech Tokens: A Review. Y Guo, Z Li, H Wang, B Li, C Shao, H Zhang, C Du, X Chen, S Liu, K Yu. arXiv preprint arXiv:2502.06490, 2025 | | 2025 |
AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures. S Zhang, H Wang, D Ma, Z Zhu, L Chen, K Lan, K Yu. arXiv preprint arXiv:2412.18910, 2024 | | 2024 |
Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding. B Li, H Wang, S Zhang, Y Guo, K Yu. arXiv preprint arXiv:2410.21951, 2024 | | 2024 |
Investigation on Training Strategy for Cross-Modal Large Language Models with Speech and Text. H Zheng, Y Liu, H Wang, K Yu. Man-Machine Speech Communication: 19th National Conference, NCMMSC 2024 …, 2024 | | 2024 |