Zhen Ye
Verified email at connect.ust.hk
Title
Cited by
Year
BLVD: Building a large-scale 5D semantics benchmark for autonomous driving
J Xue, J Fang, T Li, B Zhang, P Zhang, Z Ye, J Dou
2019 International Conference on Robotics and Automation (ICRA), 6685-6691, 2019
Cited by 72, 2019
Comospeech: One-step speech and singing voice synthesis via consistency model
Z Ye, W Xue, X Tan, J Chen, Q Liu, Y Guo
Proceedings of the 31st ACM International Conference on Multimedia, 1831-1839, 2023
Cited by 37, 2023
FlashSpeech: Efficient Zero-Shot Speech Synthesis
Z Ye, Z Ju, H Liu, X Tan, J Chen, Y Lu, P Sun, J Pan, W Bian, S He, W Xue, ...
ACM MM 2024, 2024
Cited by 12, 2024
Comosvc: Consistency model-based singing voice conversion
Y Lu, Z Ye, W Xue, X Tan, Q Liu, Y Guo
2024 IEEE 14th International Symposium on Chinese Spoken Language Processing …, 2024
Cited by 10, 2024
Mfc-bench: Benchmarking multimodal fact-checking with large vision-language models
S Wang, H Lin, Z Luo, Z Ye, G Chen, J Ma
arXiv preprint arXiv:2406.11288, 2024
Cited by 5, 2024
NAS-FM: neural architecture search for tunable and interpretable sound synthesis based on frequency modulation
Z Ye, W Xue, X Tan, Q Liu, Y Guo
IJCAI 2023, 2023
Cited by 5, 2023
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model
Z Ye, P Sun, J Lei, H Lin, X Tan, Z Dai, Q Kong, J Chen, J Pan, Q Liu, ...
AAAI 2025, 2024
Cited by 4, 2024
FastSAG: Towards Fast Non-Autoregressive Singing Accompaniment Generation
J Chen, W Xue, X Tan, Z Ye, Q Liu, Y Guo
IJCAI 2024, 2024
Cited by 2, 2024
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis
Z Ye, X Zhu, CM Chan, X Wang, X Tan, J Lei, Y Peng, H Liu, Y Jin, Z DAI, ...
arXiv preprint arXiv:2502.04128, 2025
Cited by 1, 2025
Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation
P Sun, S Cheng, X Li, Z Ye, H Liu, H Zhang, W Xue, Y Guo
arXiv preprint arXiv:2410.10676, 2024
Cited by 1, 2024
ScratchEval: Are GPT-4o Smarter than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges
R Fu, Z Luo, H Lin, Z Ye, J Ma
arXiv preprint arXiv:2411.18932, 2024
2024
PyramidCodec: Hierarchical Codec for Long-form Music Generation in Audio Domain
J Chen, Z Dai, Z Ye, X Tan, Q Liu, Y Guo, W Xue
Findings of the Association for Computational Linguistics: EMNLP 2024, 4253-4263, 2024
2024