Title | Authors | Venue | Cited by | Year |
ET-BERT: A contextualized datagram representation with pre-training transformers for encrypted traffic classification | X Lin, G Xiong, G Gou, Z Li, J Shi, J Yu | Proceedings of the ACM Web Conference 2022, 633-642, 2022 | 242 | 2022 |
Feature integration analysis of bag-of-features model for image retrieval | J Yu, Z Qin, T Wan, X Zhang | Neurocomputing 120, 355-364, 2013 | 206 | 2013 |
Mucko: Multi-layer cross-modal knowledge reasoning for fact-based visual question answering | Z Zhu, J Yu, Y Wang, Y Sun, Y Hu, Q Wu | arXiv preprint arXiv:2006.09073, 2020 | 151 | 2020 |
A SIFT-LBP image retrieval model based on bag of features | X Yuan, J Yu, Z Qin, T Wan | IEEE international conference on image processing, 1061-1064, 2011 | 134 | 2011 |
CogTree: Cognition tree loss for unbiased scene graph generation | J Yu, Y Chai, Y Wang, Y Hu, Q Wu | arXiv preprint arXiv:2009.07526, 2020 | 128 | 2020 |
MuKEA: Multimodal knowledge extraction and accumulation for knowledge-based visual question answering | Y Ding, J Yu, B Liu, Y Hu, M Cui, Q Wu | Proceedings of the IEEE/CVF conference on computer vision and pattern …, 2022 | 127 | 2022 |
Syntax-BERT: Improving pre-trained transformers with syntax trees | Jiangang Bai, Yujing Wang, Yiren Chen, Yaming Yang, Jing Bai, Jing Yu, Yunhai Tong | The Conference of the European Chapter of the Association for Computational …, 2021 | 122* | 2021 |
Cross-modal knowledge reasoning for knowledge-based visual question answering | J Yu, Z Zhu, Y Wang, W Zhang, Y Hu, J Tan | Pattern Recognition 108, 107563, 2020 | 113 | 2020 |
Reasoning on the relation: Enhancing visual representation for visual question answering and cross-modal retrieval | J Yu, W Zhang, Y Lu, Z Qin, Y Hu, J Tan, Q Wu | IEEE Transactions on Multimedia 22 (12), 3196-3209, 2020 | 93 | 2020 |
DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue | X Jiang, J Yu, Z Qin, Y Zhuang, X Zhang, Y Hu, Q Wu | AAAI 1 (3), 5, 2020 | 77 | 2020 |
Multimodal feature fusion by relational reasoning and attention for visual question answering | W Zhang, J Yu, H Hu, H Hu, Z Qin | Information Fusion 55, 116-126, 2020 | 75 | 2020 |
Modeling text with graph convolutional network for cross-modal information retrieval | J Yu, Y Lu, Z Qin, W Zhang, Y Liu, J Tan, L Guo | Advances in Multimedia Information Processing–PCM 2018: 19th Pacific-Rim …, 2018 | 56 | 2018 |
DMRFNet: deep multimodal reasoning and fusion for visual question answering and explanation generation | W Zhang, J Yu, W Zhao, C Ran | Information Fusion 72, 70-79, 2021 | 54 | 2021 |
Evolving attention with residual convolutions | Y Wang, Y Yang, J Bai, M Zhang, J Bai, J Yu, C Zhang, G Huang, Y Tong | International Conference on Machine Learning, 10971-10980, 2021 | 40 | 2021 |
KBGN: Knowledge-bridge graph network for adaptive vision-text reasoning in visual dialogue | X Jiang, S Du, Z Qin, Y Sun, J Yu | Proceedings of the 28th ACM international conference on multimedia, 1265-1273, 2020 | 38 | 2020 |
Scene graph reasoning with prior visual relationship for visual question answering | Z Yang, Z Qin, J Yu, Y Hu | arXiv preprint arXiv:1812.09681, 2018 | 34 | 2018 |
Multimodal deep fusion for image question answering | W Zhang, J Yu, Y Wang, W Wang | Knowledge-Based Systems 212, 106639, 2021 | 33 | 2021 |
Cross-modal topic correlations for multimedia retrieval | J Yu, Y Cong, Z Qin, T Wan | Proceedings of the 21st International Conference on Pattern Recognition …, 2012 | 32 | 2012 |
Context-I2W: mapping images to context-dependent words for accurate zero-shot composed image retrieval | Y Tang, J Yu, K Gai, J Zhuang, G Xiong, Y Hu, Q Wu | Proceedings of the AAAI Conference on Artificial Intelligence 38 (6), 5180-5188, 2024 | 30 | 2024 |
Learning dual encoding model for adaptive visual understanding in visual dialogue | J Yu, X Jiang, Z Qin, W Zhang, Y Hu, Q Wu | IEEE Transactions on Image Processing 30, 220-233, 2020 | 29 | 2020 |