ADD 2022: the first audio deep synthesis detection challenge J Yi, R Fu, J Tao, S Nie, H Ma, C Wang, T Wang, Z Tian, Y Bai, C Fan, ... ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 205 | 2022 |
ADD 2023: the second audio deepfake detection challenge J Yi, J Tao, R Fu, X Yan, C Wang, T Wang, CY Zhang, X Zhang, Y Zhao, ... arXiv preprint arXiv:2305.13774, 2023 | 111 | 2023 |
Half-truth: A partially fake audio detection dataset J Yi, Y Bai, J Tao, H Ma, Z Tian, C Wang, T Wang, R Fu arXiv preprint arXiv:2104.03617, 2021 | 92 | 2021 |
CFAD: A Chinese dataset for fake audio detection H Ma, J Yi, C Wang, X Yan, J Tao, T Wang, S Wang, R Fu Speech Communication 164, 103122, 2024 | 44 | 2024 |
An initial investigation for detecting vocoder fingerprints of fake audio X Yan, J Yi, J Tao, C Wang, H Ma, T Wang, S Wang, R Fu Proceedings of the 1st International Workshop on Deepfake Detection for …, 2022 | 29 | 2022 |
Fewer-token neural speech codec with time-invariant codes Y Ren, T Wang, J Yi, L Xu, J Tao, CY Zhang, J Zhou ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 28 | 2024 |
Spoken Content and Voice Factorization for Few-Shot Speaker Adaptation. T Wang, J Tao, R Fu, J Yi, Z Wen, R Zhong Interspeech, 796-800, 2020 | 27 | 2020 |
Prosody and voice factorization for few-shot speaker adaptation in the challenge M2VoC 2021 T Wang, R Fu, J Yi, J Tao, Z Wen, C Qiang, S Wang ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and …, 2021 | 22 | 2021 |
CampNet: Context-aware mask prediction for end-to-end text-based speech editing T Wang, J Yi, R Fu, J Tao, Z Wen IEEE/ACM Transactions on Audio, Speech, and Language Processing 30, 2241-2254, 2022 | 20 | 2022 |
Focusing on attention: prosody transfer and adaptative optimization strategy for multi-speaker end-to-end speech synthesis R Fu, J Tao, Z Wen, J Yi, T Wang ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and …, 2020 | 13 | 2020 |
EmoFake: An initial dataset for emotion fake audio detection Y Zhao, J Yi, J Tao, C Wang, Y Dong China National Conference on Chinese Computational Linguistics, 419-433, 2024 | 10 | 2024 |
Bi-Level Speaker Supervision for One-Shot Speech Synthesis. T Wang, J Tao, R Fu, J Yi, Z Wen, C Qiang Interspeech, 3989-3993, 2020 | 9 | 2020 |
Minimally-supervised speech synthesis with conditional diffusion model and language model: A comparative study of semantic coding C Qiang, H Li, H Ni, H Qu, R Fu, T Wang, L Wang, J Dang ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 8 | 2024 |
Singing-Tacotron: Global duration control attention and dynamic filter for end-to-end singing voice synthesis T Wang, R Fu, J Yi, Z Wen, J Tao Proceedings of the 1st International Workshop on Deepfake Detection for …, 2022 | 8 | 2022 |
Dynamic Soft Windowing and Language Dependent Style Token for Code-Switching End-to-End Speech Synthesis. R Fu, J Tao, Z Wen, J Yi, C Qiang, T Wang Interspeech, 2937-2941, 2020 | 8 | 2020 |
Context-aware mask prediction network for end-to-end text-based speech editing T Wang, J Yi, L Deng, R Fu, J Tao, Z Wen ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and …, 2022 | 6 | 2022 |
Non-Autoregressive End-to-End TTS with Coarse-to-Fine Decoding. T Wang, X Liu, J Tao, J Yi, R Fu, Z Wen Interspeech, 3984-3988, 2020 | 6 | 2020 |
Learning speech representation from contrastive token-acoustic pretraining C Qiang, H Li, Y Tian, R Fu, T Wang, L Wang, J Dang ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and …, 2024 | 5 | 2024 |
Adversarial multi-task learning for Mandarin prosodic boundary prediction with multi-modal embeddings J Yi, J Tao, R Fu, T Wang, CY Zhang, C Wang IEEE/ACM Transactions on Audio, Speech, and Language Processing 31, 2963-2973, 2023 | 4 | 2023 |
UnifySpeech: A unified framework for zero-shot text-to-speech and voice conversion H Liu, T Wang, R Fu, J Yi, Z Wen, J Tao arXiv preprint arXiv:2301.03801, 2023 | 3 | 2023 |