A battle of network structures: An empirical study of CNN, Transformer, and MLP Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha arXiv preprint arXiv:2108.13002, 2021 | 114 | 2021 |
Sparse MLP for image recognition: Is self-attention really necessary? C Tang, Y Zhao, G Wang, C Luo, W Xie, W Zeng Proceedings of the AAAI conference on artificial intelligence 36 (2), 2344-2351, 2022 | 109 | 2022 |
Joint time-frequency and time domain learning for speech enhancement C Tang, C Luo, Z Zhao, W Xie, W Zeng Proceedings of the twenty-ninth international conference on international …, 2021 | 80 | 2021 |
When shift operation meets vision transformer: An extremely simple alternative to attention mechanism G Wang, Y Zhao, C Tang, C Luo, W Zeng Proceedings of the AAAI Conference on Artificial Intelligence 36 (2), 2423-2430, 2022 | 75 | 2022 |
Look before you match: Instance understanding matters in video object segmentation J Wang, D Chen, Z Wu, C Luo, C Tang, X Dai, Y Zhao, Y Xie, L Yuan, ... Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2023 | 50 | 2023 |
Streaming video model Y Zhao, C Luo, C Tang, D Chen, N Codella, ZJ Zha Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 14 | 2023 |
TridentSE: Guiding speech enhancement with 32 global tokens D Yin, Z Zhao, C Tang, Z Xiong, C Luo arXiv preprint arXiv:2210.12995, 2022 | 14 | 2022 |
RetrieverTTS: Modeling decomposed factors for text-based speech insertion D Yin, C Tang, Y Liu, X Wang, Z Zhao, Y Zhao, Z Xiong, S Zhao, C Luo arXiv preprint arXiv:2206.13865, 2022 | 14 | 2022 |
A battle of network structures: An empirical study of CNN, Transformer, and MLP Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha arXiv preprint arXiv:2108.13002, 2021 | 13 | 2021 |
Zero-shot text-to-speech for text-based insertion in audio narration C Tang, C Luo, Z Zhao, D Yin, Y Zhao, W Zeng arXiv preprint arXiv:2109.05426, 2021 | 9 | 2021 |
MicroCinema: A divide-and-conquer approach for text-to-video generation Y Wang, J Bao, W Weng, R Feng, D Yin, T Yang, J Zhang, Q Dai, Z Zhao, ... Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 8 | 2024 |
General-purpose speech representation learning through a self-supervised multi-granularity framework Y Zhao, D Yin, C Luo, Z Zhao, C Tang, W Zeng, ZJ Zha arXiv preprint arXiv:2102.01930, 2021 | 8 | 2021 |
Method and system for video frame interpolation based on optical flow method T Chuanxin, R Wang, Z Wang, W Gao US Patent 10,531,093, 2020 | 8 | 2020 |
A new frame interpolation method with pixel-level motion vector field C Tang, R Wang, W Wang, W Gao 2014 IEEE Visual Communications and Image Processing Conference, 350-353, 2014 | 6 | 2014 |
A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP Y Zhao, G Wang, C Tang, C Luo, W Zeng, ZJ Zha arXiv: abs/2108.13002, 2021 | 5 | 2021 |
An anchor-free detector for continuous speech keyword spotting Z Zhao, C Tang, C Yao, C Luo arXiv preprint arXiv:2208.04622, 2022 | 2 | 2022 |
Frame interpolation with pixel-level motion vector field and mesh based hole filling C Tang, R Wang, Z Li, W Wang, W Gao CAAI Transactions on Intelligence Technology 1 (1), 72-78, 2016 | 2 | 2016 |
MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation L Zheng, Y Zhang, H Guo, J Pan, Z Tan, J Lu, C Tang, B An, S Yan arXiv preprint arXiv:2412.04448, 2024 | | 2024 |
Speech enhancement T Chuanxin, Z Zhao, C Luo, W Zeng US Patent App. 17/927,861, 2023 | | 2023 |
Filler Word Detection with Hard Category Mining and Inter-Category Focal Loss Z Zhao, L Wu, C Tang, D Yin, Y Zhao, C Luo ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023 | | 2023 |