From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e. describing images …

Supervised learning of semantic classes for image annotation and retrieval

G Carneiro, AB Chan, PJ Moreno… - IEEE transactions on …, 2007 - ieeexplore.ieee.org
A probabilistic formulation for semantic image annotation and retrieval is proposed.
Annotation and retrieval are posed as classification problems where each class is defined …

On social networks and collaborative recommendation

I Konstas, V Stathopoulos, JM Jose - Proceedings of the 32nd …, 2009 - dl.acm.org
Social network systems, like last.fm, play a significant role in Web 2.0, containing large
amounts of multimedia-enriched data that are enhanced both by explicit user-provided …

Tag ranking

D Liu, XS Hua, L Yang, M Wang, HJ Zhang - Proceedings of the 18th …, 2009 - dl.acm.org
Social media sharing web sites like Flickr allow users to annotate images with free tags,
which significantly facilitate Web image search and organization. However, the tags …

Semantic interdisciplinary evaluation of image captioning models

U Sirisha, B Sai Chandana - Cogent Engineering, 2022 - Taylor & Francis
In our day-to-day life, synchronizing vision and language aspects plays a crucial role in
solving various real-time challenges. Image captioning is one of them, and it aims to …

Annosearch: Image auto-annotation by search

XJ Wang, L Zhang, F Jing… - 2006 IEEE Computer …, 2006 - ieeexplore.ieee.org
Although it has been studied for several years by computer vision and machine learning
communities, image annotation is still far from practical. In this paper, we present …

Show, edit and tell: a framework for editing image captions

F Sammani, L Melas-Kyriazi - Proceedings of the IEEE/CVF …, 2020 - openaccess.thecvf.com
Most image captioning frameworks generate captions directly from images, learning a
mapping from visual features to natural language. However, editing existing captions can be …

Video search reranking through random walk over document-level context graph

WH Hsu, LS Kennedy, SF Chang - Proceedings of the 15th ACM …, 2007 - dl.acm.org
Multimedia search over distributed sources often result in recurrent images or videos which
are manifested beyond the textual modality. To exploit such contextual patterns and keep …

Unifying guilt-by-association approaches: Theorems and fast algorithms

D Koutra, TY Ke, U Kang, DH Chau, HKK Pao… - Machine Learning and …, 2011 - Springer
If several friends of Smith have committed petty thefts, what would you say about Smith?
Most people would not be surprised if Smith is a hardened criminal. Guilt-by-association …

Annotating images by mining image search results

XJ Wang, L Zhang, X Li, WY Ma - IEEE Transactions on Pattern …, 2008 - ieeexplore.ieee.org
In this paper, we propose a novel attempt of model-free image annotation which annotates
images by mining their search results. It contains three steps: 1) the search process to …