Socializing the semantic gap: A comparative survey on image tag assignment, refinement, and retrieval
Where previous reviews on content-based image retrieval emphasize what can be seen in
an image to bridge the semantic gap, this survey considers what people tag about an image …
Image retrieval on real-life images with pre-trained vision-and-language models
Z Liu, C Rodriguez-Opazo… - Proceedings of the …, 2021 - openaccess.thecvf.com
We extend the task of composed image retrieval, where an input query consists of an image
and short textual description of how to modify the image. Existing methods have only been …
Deep multiple instance learning for image classification and auto-annotation
The recent development in learning deep representations has demonstrated its wide
applications in traditional vision tasks like classification and detection. However, there has …
Learning with augmented features for supervised and semi-supervised heterogeneous domain adaptation
In this paper, we study the heterogeneous domain adaptation (HDA) problem, in which the
data from the source domain and the target domain are represented by heterogeneous …
Taxonomy, state-of-the-art, challenges and applications of visual understanding: A review
NY Khanday, SA Sofi - Computer Science Review, 2021 - Elsevier
Since the dawn of Humanity, to communicate both abstract and concrete ideas, visualization
through visual imagery has been an effective way. With the advancement of scientific …
Learning multi-level deep representations for image emotion classification
In this paper, we propose a new deep network that learns multi-level deep representations
for image emotion classification (MldrNet). Image emotion can be recognized through image …
Flexattention for efficient high-resolution vision-language models
Current high-resolution vision-language models encode images as high-resolution image
tokens and exhaustively take all these tokens to compute attention, which significantly …
Discriminative multi-instance multitask learning for 3D action recognition
As the prosperity of low-cost and easy-operating depth cameras, skeleton-based human
action recognition has been extensively studied recently. However, most of the existing …
Bi-directional training for composed image retrieval via text prompt learning
Composed image retrieval searches for a target image based on a multi-modal user query
comprised of a reference image and modification text describing the desired changes …
Towards automatic construction of diverse, high-quality image datasets
The availability of labeled image datasets has been shown critical for high-level image
understanding, which continuously drives the progress of feature designing and models …