Towards robust pattern recognition: A review

XY Zhang, CL Liu, CY Suen - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
The accuracies for many pattern recognition tasks have increased rapidly year by year,
achieving or even outperforming human performance. From the perspective of accuracy …

Cross-modal retrieval: a systematic review of methods and future directions

T Wang, F Li, L Zhu, J Li, Z Zhang… - Proceedings of the …, 2025 - ieeexplore.ieee.org
With the exponential surge in diverse multimodal data, traditional unimodal retrieval
methods struggle to meet the needs of users seeking access to data across various …

Graph neural networks: foundation, frontiers and applications

L Wu, P Cui, J Pei, L Zhao, X Guo - … of the 28th ACM SIGKDD conference …, 2022 - dl.acm.org
The field of graph neural networks (GNNs) has seen rapid and incredible strides over the
recent years. Graph neural networks, also known as deep learning on graphs, graph …

Negative-aware attention framework for image-text matching

K Zhang, Z Mao, Q Wang… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Image-text matching, as a fundamental task, bridges the gap between vision and language.
The key of this task is to accurately measure similarity between these two modalities. Prior …

Similarity reasoning and filtration for image-text matching

H Diao, Y Zhang, L Ma, H Lu - Proceedings of the AAAI conference on …, 2021 - ojs.aaai.org
Image-text matching plays a critical role in bridging the vision and language, and great
progress has been made by exploiting the global alignment between image and sentence …

Multi-modality cross attention network for image and sentence matching

X Wei, T Zhang, Y Li, Y Zhang… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com
The key of image and sentence matching is to accurately measure the visual-semantic
similarity between an image and a sentence. However, most existing methods make use of …

Visual semantic reasoning for image-text matching

K Li, Y Zhang, K Li, Y Li, Y Fu - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Image-text matching has been a hot research topic bridging the vision and language areas.
It remains challenging because the current representation of image usually lacks global …

Cross-modality person re-identification with shared-specific feature transfer

Y Lu, Y Wu, B Liu, T Zhang, B Li… - Proceedings of the …, 2020 - openaccess.thecvf.com
Cross-modality person re-identification (cm-ReID) is a challenging but key technology for
intelligent video analysis. Existing works mainly focus on learning modality-shared …

Multimodal contrastive training for visual representation learning

X Yuan, Z Lin, J Kuen, J Zhang… - Proceedings of the …, 2021 - openaccess.thecvf.com
We develop an approach to learning visual representations that embraces multimodal data,
driven by a combination of intra- and inter-modal similarity preservation objectives. Unlike …

Fine-grained video-text retrieval with hierarchical graph reasoning

S Chen, Y Zhao, Q Jin, Q Wu - Proceedings of the IEEE/CVF …, 2020 - openaccess.thecvf.com
Cross-modal retrieval between videos and texts has attracted growing attention due to the
rapid emergence of videos on the web. The current dominant approach is to learn a joint …