Structured two-stream attention network for video question answering L Gao, P Zeng, J Song, YF Li, W Liu, T Mei, HT Shen Proceedings of the AAAI conference on artificial intelligence 33 (01), 6391-6398, 2019 | 79 | 2019 |
From pixels to objects: Cubic visual attention for visual question answering J Song, P Zeng, L Gao, HT Shen IJCAI, 2018 | 79* | 2018 |
Hierarchical representation network with auxiliary tasks for video captioning and video question answering L Gao, Y Lei, P Zeng, J Song, M Wang, HT Shen IEEE Transactions on Image Processing 31, 202-215, 2021 | 77 | 2021 |
Rich visual knowledge-based augmentation network for visual question answering L Zhang, S Liu, D Liu, P Zeng, X Li, J Song, L Gao IEEE Transactions on Neural Networks and Learning Systems 32 (10), 4362-4373, 2020 | 65 | 2020 |
S2 Transformer for Image Captioning P Zeng, H Zhang, J Song, L Gao IJCAI, 1608-1614, 2022 | 60 | 2022 |
Conceptual and syntactical cross-modal alignment with cross-level consistency for image-text matching P Zeng, L Gao, X Lyu, S Jing, J Song Proceedings of the 29th ACM International Conference on Multimedia, 2205-2213, 2021 | 35 | 2021 |
Video question answering with prior knowledge and object-sensitive learning P Zeng, H Zhang, L Gao, J Song, HT Shen IEEE Transactions on Image Processing 31, 5936-5948, 2022 | 34 | 2022 |
Text-instance graph: Exploring the relational semantics for text-based visual question answering X Li, B Wu, J Song, L Gao, P Zeng, C Gan Pattern Recognition 124, 108455, 2022 | 33 | 2022 |
Examine before you answer: Multi-task learning with adaptive-attentions for multiple-choice VQA L Gao, P Zeng, J Song, X Liu, HT Shen Proceedings of the 26th ACM international conference on Multimedia, 1742-1750, 2018 | 33 | 2018 |
Memory-based augmentation network for video captioning S Jing, H Zhang, P Zeng, L Gao, J Song, HT Shen IEEE Transactions on Multimedia, 2023 | 27 | 2023 |
Complementarity-aware space learning for video-text retrieval J Zhu, P Zeng, L Gao, G Li, D Liao, J Song IEEE Transactions on Circuits and Systems for Video Technology 33 (8), 4362-4374, 2023 | 26 | 2023 |
Progressive tree-structured prototype network for end-to-end image captioning P Zeng, J Zhu, J Song, L Gao Proceedings of the 30th ACM International Conference on Multimedia, 5210-5218, 2022 | 24 | 2022 |
Learning visual question answering on controlled semantic noisy labels H Zhang, P Zeng, Y Hu, J Qian, J Song, L Gao Pattern Recognition 138, 109339, 2023 | 23 | 2023 |
Adaptive fine-grained predicates learning for scene graph generation X Lyu, L Gao, P Zeng, HT Shen, J Song IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023 | 18 | 2023 |
Dynamic scene graph generation via temporal prior inference S Wang, L Gao, X Lyu, Y Guo, P Zeng, J Song Proceedings of the 30th ACM International Conference on Multimedia, 5793-5801, 2022 | 18 | 2022 |
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval H Li, J Song, L Gao, P Zeng, H Zhang, G Li Advances in Neural Information Processing Systems, 2022 | 17 | 2022 |
Visual commonsense-aware representation network for video captioning P Zeng, H Zhang, L Gao, X Li, J Qian, HT Shen IEEE Transactions on Neural Networks and Learning Systems, 2023 | 16 | 2023 |
Dual-branch hybrid learning network for unbiased scene graph generation C Zheng, L Gao, X Lyu, P Zeng, AE Saddik, HT Shen IEEE Transactions on Circuits and Systems for Video Technology, 2023 | 15 | 2023 |
SPT: Spatial pyramid transformer for image captioning H Zhang, P Zeng, L Gao, X Lyu, J Song, HT Shen IEEE Transactions on Circuits and Systems for Video Technology, 2023 | 14 | 2023 |
ProS: Prompting-to-simulate Generalized knowledge for Universal Cross-Domain Retrieval K Fang, J Song, L Gao, P Zeng, ZQ Cheng, X Li, HT Shen Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024 | 12 | 2024 |