mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections | C Li, H Xu, J Tian, W Wang, M Yan, B Bi, J Ye, H Chen, G Xu, Z Cao, ... | arXiv preprint arXiv:2205.12005, 2022 | 133 | 2022 |
Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding | J Ye, J Tian, M Yan, X Yang, X Wang, J Zhang, L He, X Lin | Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2022 | 76 | 2022 |
One-stage visual grounding via semantic-aware feature filter | J Ye, X Lin, L He, D Li, Q Chen | Proceedings of the 29th ACM International Conference on Multimedia, 1702-1711, 2021 | 32 | 2021 |
Inferring substitutable and complementary products with Knowledge-Aware Path Reasoning based on dynamic policy network | Z Yang, J Ye, L Wang, X Lin, L He | Knowledge-Based Systems 235, 107579, 2022 | 22 | 2022 |
: Prompt-Based Entity-Related Visual Clue Extraction and Integration for Multimodal Named Entity Recognition | X Wang, J Tian, M Gui, Z Li, J Ye, M Yan, Y Xiao | International Conference on Database Systems for Advanced Applications, 297-305, 2022 | 20 | 2022 |