Front-end architecture for a multi-lingual text-to-speech system M Chu, H Peng, Y Zhao US Patent 7,496,498, 2009 | 399 | 2009 |
Providing personalized voice font for text-to-speech applications M Chu, Y Zhao, S Zhao US Patent 7,693,719, 2010 | 355 | 2010 |
Refining of segmental boundaries in speech waveforms using contextual-dependent models Y Zhao, M Chu, J Zhou, L Wang US Patent 7,496,512, 2009 | 341 | 2009 |
Unnatural prosody detection in speech synthesis Y Zhao, FKP Soong, M Chu, L Wang US Patent 8,583,438, 2013 | 265 | 2013 |
Speech unit selection using HMM acoustic models M Chu, P Liu, Y Zhao, Y Li US Patent App. 11/508,093, 2008 | 249 | 2008 |
Defining atom units between phone and syllable for TTS systems M Chu, Y Zhao US Patent 7,418,389, 2008 | 227 | 2008 |
End-to-end attention based text-dependent speaker verification SX Zhang, Z Chen, Y Zhao, J Li, Y Gong 2016 IEEE Spoken Language Technology Workshop (SLT), 171-178, 2016 | 208 | 2016 |
ResNeXt and Res2Net structures for speaker verification T Zhou, Y Zhao, J Wu 2021 IEEE Spoken Language Technology Workshop (SLT), 301-307, 2021 | 201 | 2021 |
Optimization of an objective measure for estimating mean opinion score of synthesized speech M Chu, H Peng, Y Zhao US Patent 7,386,451, 2008 | 180 | 2008 |
Speaker-invariant training via adversarial learning Z Meng, J Li, Z Chen, Y Zhao, V Mazalov, Y Gong, BH Juang 2018 IEEE International Conference on Acoustics, Speech and Signal …, 2018 | 140 | 2018 |
Conditional teacher-student learning Z Meng, J Li, Y Zhao, Y Gong ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 126 | 2019 |
Microsoft Mulan - a bilingual TTS system M Chu, H Peng, Y Zhao, Z Niu, E Chang 2003 IEEE International Conference on Acoustics, Speech, and Signal …, 2003 | 93 | 2003 |
Adversarial speaker verification Z Meng, Y Zhao, J Li, Y Gong ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019 | 91 | 2019 |
Advances in online audio-visual meeting transcription T Yoshioka, I Abramovski, C Aksoylar, Z Chen, M David, D Dimitriadis, ... 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 90 | 2019 |
Microsoft speaker diarization system for the VoxCeleb speaker recognition challenge 2020 X Xiao, N Kanda, Z Chen, T Zhou, T Yoshioka, S Chen, Y Zhao, G Liu, ... arXiv preprint arXiv:2010.11458, 2020 | 83 | 2020 |
Low-rank plus diagonal adaptation for deep neural networks Y Zhao, J Li, Y Gong 2016 IEEE International Conference on Acoustics, Speech and Signal …, 2016 | 77 | 2016 |
CNN with phonetic attention for text-independent speaker verification T Zhou, Y Zhao, J Li, Y Gong, J Wu 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU …, 2019 | 70 | 2019 |
Improving Deep CNN Networks with Long Temporal Context for Text-Independent Speaker Verification Y Zhao, T Zhou, Z Chen, J Wu ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 49 | 2020 |
Refining segmental boundaries for TTS database using fine contextual-dependent boundary models L Wang, Y Zhao, M Chu, J Zhou, Z Cao 2004 IEEE International Conference on Acoustics, Speech, and Signal …, 2004 | 48 | 2004 |