Scene recognition with CNNs: objects, scales and dataset bias L Herranz, S Jiang, X Li Proceedings of the IEEE conference on computer vision and pattern …, 2016 | 249 | 2016 |
Know more say less: Image captioning based on scene graphs X Li, S Jiang IEEE Transactions on Multimedia 21 (8), 2117-2130, 2019 | 197 | 2019 |
Learning object context for dense captioning X Li, S Jiang, J Han Proceedings of the AAAI conference on artificial intelligence 33 (01), 8650-8657, 2019 | 65 | 2019 |
GridMM: Grid memory map for vision-and-language navigation Z Wang, X Li, J Yang, Y Liu, S Jiang Proceedings of the IEEE/CVF International Conference on Computer Vision …, 2023 | 48 | 2023 |
KERM: Knowledge enhanced reasoning for vision-and-language navigation X Li, Z Wang, J Yang, Y Wang, S Jiang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2023 | 39 | 2023 |
Visual relationship detection with object spatial distribution Y Zhu, S Jiang, X Li 2017 IEEE International Conference on Multimedia and Expo (ICME), 379-384, 2017 | 37 | 2017 |
Image captioning with both object and scene information X Li, X Song, L Herranz, Y Zhu, S Jiang Proceedings of the 24th ACM international conference on Multimedia, 1107-1110, 2016 | 29 | 2016 |
Bundled Object Context for Referring Expressions X Li, S Jiang IEEE Transactions on Multimedia, 2018 | 26 | 2018 |
Class agnostic image common object detection S Jiang, S Liang, C Chen, Y Zhu, X Li IEEE Transactions on Image Processing 28 (6), 2836-2846, 2019 | 22 | 2019 |
Dataset bias in few-shot image recognition S Jiang, Y Zhu, C Liu, X Song, X Li, W Min IEEE Transactions on Pattern Analysis and Machine Intelligence 45 (1), 229-246, 2022 | 21 | 2022 |
ISIA at the ImageCLEF 2017 Image Caption Task. S Liang, X Li, Y Zhu, X Li, S Jiang CLEF (Working Notes), 2017 | 18 | 2017 |
Where and what to eat: Simultaneous restaurant and dish recognition from food image H Wang, W Min, X Li, S Jiang Advances in Multimedia Information Processing-PCM 2016: 17th Pacific-Rim …, 2016 | 17 | 2016 |
The retrieval of shoeprint images based on the integral histogram of the Gabor transform domain X Li, M Wu, Z Shi Intelligent Information Processing VII: 8th IFIP TC 12 International …, 2014 | 16 | 2014 |
Modality-specific and hierarchical feature learning for RGB-D hand-held object recognition X Lv, X Liu, X Li, X Li, S Jiang, Z He Multimedia Tools and Applications 76, 4273-4290, 2017 | 12 | 2017 |
Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation Z Wang, X Li, J Yang, Y Liu, J Hu, M Jiang, S Jiang Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024 | 10 | 2024 |
Multifaceted analysis of fine-tuning in a deep model for visual recognition X Li, L Herranz, S Jiang ACM Transactions on Data Science 1 (1), 1-22, 2020 | 8 | 2020 |
Joint Learning of CNN and LSTM for Image Captioning. Y Zhu, X Li, X Li, J Sun, X Song, S Jiang CLEF (Working Notes), 421-427, 2016 | 7 | 2016 |
Sim-to-real transfer via 3D feature fields for vision-and-language navigation Z Wang, X Li, J Yang, Y Liu, S Jiang arXiv preprint arXiv:2406.09798, 2024 | 6 | 2024 |
MemBridge: Video-language pre-training with memory-augmented inter-modality bridge J Yang, X Li, M Zheng, Z Wang, Y Zhu, X Guo, Y Yuan, Z Chai, S Jiang IEEE Transactions on Image Processing, 2023 | 6 | 2023 |
Heterogeneous convolutional neural networks for visual recognition X Li, L Herranz, S Jiang Advances in Multimedia Information Processing-PCM 2016: 17th Pacific-Rim …, 2016 | 5 | 2016 |