Yihan Wu - Google Scholar

Cited by

	All	Since 2020
Citations	259	259
h-index	7	7
i10-index	6	6

Citations per year: 2022: 6; 2023: 71; 2024: 160; 2025: 22

Co-authors

Ruihua Song, Renmin University of China, verified email at ruc.edu.cn
Xu Tan, Principal Researcher and Research Manager, Microsoft, verified email at microsoft.com
Shinji Watanabe, Carnegie Mellon University, verified email at cmu.edu
Jiatong Shi (史嘉彤), Carnegie Mellon University, verified email at andrew.cmu.edu

Yihan Wu

Renmin University of China

Verified email at ruc.edu.cn - Homepage

Speech synthesis, AI based creation, multi-modality, chitchat, natural language understanding


Title	Cited by	Year
PromptTTS: Controllable Text-to-Speech with Text Descriptions. Z Guo, Y Leng, Y Wu, S Zhao, X Tan. ICASSP 2023, 2022	101	2022
Adaspeech 4: Adaptive text to speech in zero-shot scenarios. Y Wu, X Tan, B Li, L He, S Zhao, R Song, T Qin, TY Liu. InterSpeech 2022, 2022	72	2022
Resgrad: Residual denoising diffusion probabilistic models for text to speech. Z Chen, Y Wu, Y Leng, J Chen, H Liu, X Tan, Y Cui, K Wang, L He, S Zhao, ... arXiv preprint arXiv:2212.14518, 2022	21	2022
VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing. Y Wu, J Guo, X Tan, C Zhang, B Li, R Song, L He, S Zhao, A Menezes, ... AAAI 2023, 2022	16	2022
Self-supervised context-aware style representation for expressive speech synthesis. Y Wu, X Wang, S Zhang, L He, R Song, JY Nie. InterSpeech 2022, 2022	16	2022
The Interspeech 2024 challenge on speech processing using discrete units. X Chang, J Shi, J Tian, Y Wu, Y Tang, Y Wu, S Watanabe, Y Adi, X Chen, ... arXiv preprint arXiv:2406.07725, 2024	13	2024
Espnet-codec: Comprehensive training and evaluation of neural codecs for audio, music, and speech. J Shi, J Tian, Y Wu, J Jung, JQ Yip, Y Masuyama, W Chen, Y Wu, Y Tang, ... 2024 IEEE Spoken Language Technology Workshop (SLT), 562-569, 2024	7	2024
Tiva: Time-aligned video-to-audio generation. X Wang, Y Wang, Y Wu, R Song, X Tan, Z Chen, H Xu, G Sui. Proceedings of the 32nd ACM International Conference on Multimedia, 573-582, 2024	5	2024
Yulan: An open-source large language model. Y Zhu, K Zhou, K Mao, W Chen, Y Sun, Z Chen, Q Cao, Y Wu, Y Chen, ... arXiv preprint arXiv:2406.19853, 2024	2	2024
Speechcomposer: Unifying multiple speech tasks with prompt composition. Y Wu, S Maiti, Y Peng, W Zhang, C Li, Y Wang, X Wang, S Watanabe, ... arXiv preprint arXiv:2401.18045, 2024	2	2024
Robust Audiovisual Speech Recognition Models with Mixture-of-Experts. Y Wu, Y Peng, Y Lu, X Chang, R Song, S Watanabe. 2024 IEEE Spoken Language Technology Workshop (SLT), 43-48, 2024	1	2024
LoVA: Long-form Video-to-Audio Generation. X Cheng, X Wang, Y Wu, Y Wang, R Song. arXiv preprint arXiv:2409.15157, 2024	1	2024
Text-to-speech synthesis in the wild. J Jung, W Zhang, S Maiti, Y Wu, X Wang, JH Kim, Y Matsunaga, S Um, ... arXiv preprint arXiv:2409.08711, 2024	1	2024
Understanding Human Preferences: Towards More Personalized Video to Text Generation. Y Wu, R Song, X Chen, H Jiang, Z Cao, J Yu. Proceedings of the ACM Web Conference 2024, 3952-3963, 2024	1	2024
SpoofCeleb: Speech Deepfake Detection and SASV In The Wild. J Jung, Y Wu, X Wang, JH Kim, S Maiti, Y Matsunaga, H Shim, J Tian, ... IEEE Open Journal of Signal Processing, 2025		2025
Enhancing Audiovisual Speech Recognition through Bifocal Preference Optimization. Y Wu, Y Lu, Y Peng, X Wang, R Song, S Watanabe. arXiv preprint arXiv:2412.19005, 2024		2024
ComedicSpeech: Text To Speech For Stand-up Comedies in Low-Resource Scenarios. Y Wang, H Xiao, Y Wu, R Song. InterSpeech 2023, 2023		2023

Articles 1–17