Mixspeech: Cross-modality self-learning with audio-visual stream mixup for visual speech translation and recognition | X Cheng, T Jin, R Huang, L Li, W Lin, Z Wang, Y Wang, H Liu, A Yin, ... | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 23 | 2023 |
TAVT: Towards Transferable Audio-Visual Text Generation | W Lin, T Jin, W Pan, L Li, X Cheng, Y Wang, Z Zhao | Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 12 | 2023 |
Exploring group video captioning with efficient relational approximation | W Lin, T Jin, Y Wang, W Pan, L Li, X Cheng, Z Zhao | Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 10 | 2023 |
EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration | Y Wang, J Xun, M Hong, J Zhu, T Jin, W Lin, H Li, L Li, Y Xia, Z Zhao, ... | Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and …, 2024 | 8 | 2024 |
Rethinking Missing Modality Learning from a Decoding Perspective | T Jin, X Cheng, L Li, W Lin, Y Wang, Z Zhao | Proceedings of the 31st ACM International Conference on Multimedia, 4431-4439, 2023 | 8 | 2023 |
Weakly-supervised spoken video grounding via semantic interaction learning | Y Wang, W Lin, S Zhang, T Jin, L Li, X Cheng, Z Zhao | Proceedings of the 61st Annual Meeting of the Association for Computational …, 2023 | 7 | 2023 |
Contrastive token-wise meta-learning for unseen performer visual temporal-aligned translation | L Li, T Jin, X Cheng, Y Wang, W Lin, R Huang, Z Zhao | Findings of the Association for Computational Linguistics: ACL 2023, 10993-11007, 2023 | 6 | 2023 |
Semantic-conditioned dual adaptation for cross-domain query-based visual segmentation | Y Wang, T Jin, W Lin, X Cheng, L Li, Z Zhao | Findings of the Association for Computational Linguistics: ACL 2023, 9797-9815, 2023 | 4 | 2023 |
Low-rank Prompt Interaction for Continual Vision-Language Retrieval | W Yan, Y Wang, W Lin, Z Guo, Z Zhao, T Jin | Proceedings of the 32nd ACM International Conference on Multimedia, 8257-8266, 2024 | 3 | 2024 |
Rethinking the multimodal correlation of multimodal sequential learning via generalizable attentional results alignment | T Jin, W Lin, Y Wang, L Li, X Cheng, Z Zhao | Proceedings of the 62nd Annual Meeting of the Association for Computational …, 2024 | 2 | 2024 |
Calibrating Prompt from History for Continual Vision-Language Retrieval and Grounding | T Jin, W Yan, Y Wang, S Cai, Q Shuai, Z Zhao | Proceedings of the 32nd ACM International Conference on Multimedia, 4302-4311, 2024 | | 2024 |