An overview of cross-media retrieval: Concepts, methodologies, benchmarks, and challenges
Multimedia retrieval plays an indispensable role in big data utilization. Past efforts mainly
focused on single-media retrieval. However, the requirements of users are highly flexible …
Cross-media analysis and reasoning: advances and directions
Cross-media analysis and reasoning is an active research area in computer science, and a
promising direction for artificial intelligence. However, to the best of our knowledge, no …
Multi-modal factorized bilinear pooling with co-attention learning for visual question answering
Visual question answering (VQA) is challenging because it requires a simultaneous
understanding of both the visual content of images and the textual content of questions. The …
Heading toward artificial intelligence 2.0
Y Pan - Engineering, 2016 - Elsevier
With the popularization of the Internet, permeation of sensor networks, emergence of big
data, increase in size of the information community, and interlinking and fusion of data and …
CM-GANs: Cross-modal generative adversarial networks for common representation learning
It is known that the inconsistent distributions and representations of different modalities, such
as image and text, cause the heterogeneity gap, which makes it very challenging to correlate …
Cross-modal retrieval with CNN visual features: A new baseline
Recently, convolutional neural network (CNN) visual features have demonstrated their
powerful ability as a universal representation for various recognition tasks. In this paper …
Align and tell: Boosting text-video retrieval with local alignment and fine-grained supervision
Text-video retrieval is one of the basic tasks for multimodal research and has been widely
harnessed in many real-world systems. Most existing approaches directly compare the …
Inter-media hashing for large-scale retrieval from heterogeneous data sources
In this paper, we present a new multimedia retrieval paradigm to innovate large-scale
search of heterogeneous multimedia data. It is able to return results of different media types …
Robust joint graph sparse coding for unsupervised spectral feature selection
In this paper, we propose a new unsupervised spectral feature selection model by
embedding a graph regularizer into the framework of joint sparse regression for preserving …
On the role of correlation and abstraction in cross-modal multimedia retrieval
The problem of cross-modal retrieval from multimedia repositories is considered. This
problem addresses the design of retrieval systems that support queries across content …