- Academic Search

Deep vision multimodal learning: Methodology, benchmark, and trend

W Chai, G Wang - Applied Sciences, 2022 - mdpi.com

Deep vision multimodal learning aims at combining deep visual representation learning with
other modalities, such as text, sound, and data collected from other sensors. With the fast …

Cited by 32 · 4 versions

A thorough review of models, evaluation metrics, and datasets on image captioning

G Luo, L Cheng, C Jing, C Zhao… - IET Image Processing, 2022 - Wiley Online Library

Image captioning means generate descriptive sentences from a query image automatically.
It has recently received widespread attention from the computer vision and natural language …

Cited by 25 · 4 versions

Exploring group video captioning with efficient relational approximation

W Lin, T Jin, Y Wang, W Pan, L Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Current video captioning efforts most focus on describing a single video while the need for
captioning videos in groups has increased considerably. In this study, we propose a new …

Cited by 10 · 3 versions

Rethinking the reference-based distinctive image captioning

Y Mao, L Chen, Z Jiang, D Zhang, Z Zhang… - Proceedings of the 30th …, 2022 - dl.acm.org

Distinctive Image Captioning (DIC)---generating distinctive captions that describe the unique
details of a target image---has received considerable attention over the last few years. A …

Cited by 25 · 3 versions

Progressive tree-structured prototype network for end-to-end image captioning

P Zeng, J Zhu, J Song, L Gao - … of the 30th ACM International Conference …, 2022 - dl.acm.org

Studies of image captioning are shifting towards a trend of a fully end-to-end paradigm by
leveraging powerful visual pre-trained models and transformer-based generation …

Cited by 24 · 3 versions

Switching to discriminative image captioning by relieving a bottleneck of reinforcement learning

U Honda, T Watanabe… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Discriminativeness is a desirable feature of image captions: captions should describe the
characteristic details of input images. However, recent high-performing captioning models …

Cited by 13 · 5 versions

Improving reference-based distinctive image captioning with contrastive rewards

Y Mao, J Xiao, D Zhang, M Cao, J Shao… - ACM Transactions on …, 2024 - dl.acm.org

Distinctive Image Captioning (DIC)—generating distinctive captions that describe the unique
details of a target image—has received considerable attention over the last few years. A …

Cited by 9 · 2 versions

Distinctive image captioning via clip guided group optimization

Y Zhang, J Wang, H Wu, W Xu - European Conference on Computer …, 2022 - Springer

Image captioning models are usually trained according to human annotated ground-truth
captions, which could generate accurate but generic captions. In this paper, we focus on …

Cited by 8 · 5 versions

Learning descriptive image captioning via semipermeable maximum likelihood estimation

Z Yue, A Hu, L Zhang, Q Jin - Advances in Neural …, 2023 - proceedings.neurips.cc

Image captioning aims to describe visual content in natural language. As 'a picture is worth a
thousand words', there could be various correct descriptions for an image. However, with …

Cited by 3 · 6 versions

Pragmatic inference with a CLIP listener for contrastive captioning

J Ou, B Krojer, D Fried - arXiv preprint arXiv:2306.08818, 2023 - arxiv.org

We propose a simple yet effective and robust method for contrastive captioning: generating
discriminative captions that distinguish target images from very similar alternative distractor …

Cited by 6 · 3 versions

Group-based distinctive image captioning with memory attention

Deep vision multimodal learning: Methodology, benchmark, and trend
