A survey on multimodal large language models S Yin, C Fu, S Zhao, K Li, X Sun, T Xu, E Chen arXiv preprint arXiv:2306.13549, 2023 | 1061 | 2023 |
Woodpecker: Hallucination correction for multimodal large language models S Yin, C Fu, S Zhao, T Xu, H Wang, D Sui, Y Shen, K Li, X Sun, E Chen Science China Information Sciences 67 (12), 220105, 2024 | 161 | 2024 |
AU-aware graph convolutional network for Macro and Micro-expression spotting S Yin, S Wu, T Xu, S Liu, S Zhao, E Chen 2023 IEEE International Conference on Multimedia and Expo (ICME), 228-233, 2023 | 11 | 2023 |
Fine-grained micro-expression generation based on thin-plate spline and relative AU constraint S Zhao, S Yin, H Tang, R Jin, Y Xu, T Xu, E Chen Proceedings of the 30th ACM International Conference on Multimedia, 7150-7154, 2022 | 9 | 2022 |
MME-Survey: A comprehensive survey on evaluation of multimodal LLMs C Fu, YF Zhang, S Yin, B Li, X Fang, S Zhao, H Duan, X Sun, Z Liu, ... arXiv preprint arXiv:2411.15296, 2024 | 5 | 2024 |
T2Vid: Translating Long Text into Multi-Image is the Catalyst for Video-LLMs S Yin, C Fu, S Zhao, Y Shen, C Ge, Y Yang, Z Long, Y Dai, T Xu, X Sun, ... arXiv preprint arXiv:2411.19951, 2024 | 1 | 2024 |
I-AM-G: Interest Augmented Multimodal Generator for Item Personalization X Wang, L Wu, S Yin, Z Li, Y Chen, H Hufeng, Y Su, Q Liu Proceedings of the 2024 Conference on Empirical Methods in Natural Language …, 2024 | | 2024 |
Exploiting Instance-level Relationships in Weakly Supervised Text-to-Video Retrieval S Yin, S Zhao, H Wang, T Xu, E Chen ACM Transactions on Multimedia Computing, Communications and Applications, 2024 | | 2024 |