- Academic Search

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e. describing images …

Cited by 396; 11 versions

[PDF] arxiv.org

A comprehensive survey of deep learning for image captioning

MDZ Hossain, F Sohel, MF Shiratuddin… - ACM Computing Surveys …, 2019 - dl.acm.org

Generating a description of an image is called image captioning. Image captioning requires
recognizing the important objects, their attributes, and their relationships in an image. It also …

Cited by 1008; 8 versions

[PDF] arxiv.org

Visual attention methods in deep learning: An in-depth survey

M Hassanin, S Anwar, I Radwan, FS Khan, A Mian - Information Fusion, 2024 - Elsevier

Inspired by the human cognitive system, attention is a mechanism that imitates the human
cognitive awareness about specific information, amplifying critical details to focus more on …

Cited by 161; 7 versions

[PDF] wiley.com

An overview of image caption generation methods

H Wang, Y Zhang, X Yu - Computational intelligence and …, 2020 - Wiley Online Library

In recent years, with the rapid development of artificial intelligence, image caption has
gradually attracted the attention of many researchers in the field of artificial intelligence and …

Cited by 140; 12 versions

[PDF] thecvf.com

CapSal: Leveraging captioning to boost semantics for salient object detection

L Zhang, J Zhang, Z Lin, H Lu… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Detecting salient objects in cluttered scenes is a big challenge. To address this problem, we
argue that the model needs to learn discriminative semantic features for salient objects. To …

Cited by 135; 4 versions

[HTML] nih.gov

[HTML] Human gaze assisted artificial intelligence: A review

R Zhang, A Saran, B Liu, Y Zhu, S Guo… - IJCAI: Proceedings of …, 2020 - ncbi.nlm.nih.gov

Human gaze reveals a wealth of information about internal cognitive state. Thus,
gaze-related research has significantly increased in computer vision, natural language …

Cited by 77; 7 versions

[PDF] thecvf.com

Model-agnostic gender debiased image captioning

Y Hirota, Y Nakashima… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Image captioning models are known to perpetuate and amplify harmful societal bias in the
training set. In this work, we aim to mitigate such gender bias in image captioning models …

Cited by 17; 5 versions

[PDF] arxiv.org

Paying more attention to saliency: Image captioning with saliency and context attention

M Cornia, L Baraldi, G Serra, R Cucchiara - ACM Transactions on …, 2018 - dl.acm.org

Image captioning has been recently gaining a lot of attention thanks to the impressive
achievements shown by deep captioning architectures, which combine Convolutional …

Cited by 125; 9 versions

[PDF] thecvf.com

Predicting human scanpaths in visual question answering

X Chen, M Jiang, Q Zhao - … of the IEEE/CVF Conference on …, 2021 - openaccess.thecvf.com

Attention has been an important mechanism for both humans and computer vision systems.
While state-of-the-art models to predict attention focus on estimating a static probabilistic …

Cited by 53; 8 versions

[PDF] arxiv.org

GazeXplain: Learning to predict natural language explanations of visual scanpaths

X Chen, M Jiang, Q Zhao - European Conference on Computer Vision, 2024 - Springer

While exploring visual scenes, humans' scanpaths are driven by their underlying attention
processes. Understanding visual scanpaths is essential for various applications. Traditional …

Cited by 3; 9 versions
