Revisiting hidden Markov models for speech emotion recognition S Mao, D Tao, G Zhang, PC Ching, T Lee ICASSP 2019-2019 IEEE international conference on acoustics, speech and …, 2019 | 98 | 2019 |
AdaSpeech 3: Adaptive text to speech for spontaneous style Y Yan, X Tan, B Li, G Zhang, T Qin, S Zhao, Y Shen, WQ Zhang, TY Liu arXiv preprint arXiv:2107.02530, 2021 | 44 | 2021 |
iEmoTTS: Toward robust cross-speaker emotion transfer and control for speech synthesis based on disentanglement between prosody and timbre G Zhang, Y Qin, W Zhang, J Wu, M Li, Y Gai, F Jiang, T Lee IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 1693-1705, 2023 | 28 | 2023 |
Mixed-phoneme BERT: Improving BERT with mixed phoneme and sup-phoneme representations for text to speech G Zhang, K Song, X Tan, D Tan, Y Yan, Y Liu, G Wang, W Zhou, T Qin, ... arXiv preprint arXiv:2203.17190, 2022 | 25 | 2022 |
Learning Syllable-Level Discrete Prosodic Representation for Expressive Speech Generation G Zhang, Y Qin, T Lee Interspeech, 3426-3430, 2020 | 12 | 2020 |
CUHK-EE voice cloning system for ICASSP 2021 M2VoC challenge D Tan, H Huang, G Zhang, T Lee arXiv preprint arXiv:2103.04699, 2021 | 11 | 2021 |
Estimating mutual information in prosody representation for emotional prosody transfer in speech synthesis G Zhang, S Qiu, Y Qin, T Lee 2021 12th International Symposium on Chinese Spoken Language Processing …, 2021 | 9 | 2021 |
Environment aware text-to-speech synthesis D Tan, G Zhang, T Lee arXiv preprint arXiv:2110.03887, 2021 | 6 | 2021 |
Recent advances in speech language models: A survey W Cui, D Yu, X Jiao, Z Meng, G Zhang, Q Wang, Y Guo, I King arXiv preprint arXiv:2410.03751, 2024 | 5 | 2024 |
Applying the information bottleneck principle to prosodic representation learning G Zhang, Y Qin, D Tan, T Lee arXiv preprint arXiv:2108.02821, 2021 | 5 | 2021 |
Comparing normalizing flows and diffusion models for prosody and acoustic modelling in text-to-speech G Zhang, T Merritt, MS Ribeiro, B Tura-Vecino, K Yanagisawa, K Pokora, ... arXiv preprint arXiv:2307.16679, 2023 | 4 | 2023 |
Creating personalized synthetic voices from post-glossectomy speech with guided diffusion models Y Tian, G Zhang, T Lee arXiv preprint arXiv:2305.17436, 2023 | 2 | 2023 |
A study on the efficacy of model pre-training in developing neural text-to-speech system G Zhang, Y Leng, D Tan, Y Qin, K Song, X Tan, S Zhao, T Lee ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 2 | 2022 |
Chinese nouns are mass nouns: An information-theoretic computational proof W Zhou, G Zhang, Y Chen Lingua 311, 103815, 2024 | 1 | 2024 |
Enabling Beam Search for Language Model-Based Text-to-Speech Synthesis Z Tu, G Zhang, Y Lu, A Adigwe, S King, Y Guo arXiv preprint arXiv:2408.16373, 2024 | | 2024 |
Chinese Nouns are Mass Nouns: An Information-Theoretic Computational Validation W Zhou, G Zhang, Y Chen Available at SSRN 4674220 | | |