- Academic Search

Mining core information by evaluating semantic importance for unpaired image captioning

J Wei, Z Li, C Zhang, H Ma - Neural Networks, 2024 - Elsevier

Recently, exciting progress has been made in the research of supervised image captioning.
However, manually annotated image-annotation pair data is difficult and expensive to …

Cited by: 2

Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning

Z Liu, J Liu, F Ma - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org

Although image captioning models have made significant advancements in recent years, the
majority of them heavily depend on high-quality datasets containing paired images and texts …

Cited by: 7

Cross-Modal Coherence-Enhanced Feedback Prompting for News Captioning

N Xu, Y Gao, TT Zhang, H Tian, AA Liu - Proceedings of the 32nd ACM …, 2024 - dl.acm.org

News Captioning involves generating the descriptions for news images based on the
detailed content of related news articles. Given that these articles often contain extensive …

Cited by: 1

Unleashing Text-to-Image Diffusion Prior for Zero-Shot Image Captioning

J Luo, J Chen, Y Li, Y Pan, J Feng, H Chao… - European Conference on …, 2024 - Springer

Recently, zero-shot image captioning has gained increasing attention, where only text data
is available for training. The remarkable progress in text-to-image diffusion model presents …

Enhancing Image Captioning Using Deep Convolutional Generative Adversarial Networks

T Jaiswal, M Pandey, P Tripathi - Recent Advances in …, 2024 - ingentaconnect.com

Introduction: Image caption generation has long been a fundamental challenge in the area
of computer vision (CV) and natural language processing (NLP). In this research, we present …

Cited by: 1

CVLP-NaVD: Contrastive Visual-Language Pre-training Models for Non-annotated Visual Description

H Li, Y Hao, J Yu, B Zhu, S Wang, T Xu - ACM Transactions on …, 2024 - dl.acm.org

Non-annotated visual description (NaVD) aims to describe generic visuals without human-
annotated pairwise data. The generic visuals refer to images and videos. Existing works …

Pseudo Content Hallucination for Unpaired Image Captioning

H Ben, S Wang, M Wang, R Hong - Proceedings of the 2024 …, 2024 - dl.acm.org

Unpaired Image Captioning (UIC) is designed to describe an image without relying on
matched vision-language training data. It is a challenging task since (1) the implicit and …

Cited by: 2

Dynamic text prompt joint multimodal features for accurate plant disease image captioning

F Liang, Z Huang, W Wang, Z He, Q En - The Visual Computer, 2024 - Springer

Plant disease captioning is crucial for agricultural pest and disease prevention. However,
generating accurate captions for plant disease images remains challenging because of the …

Exploring annotation-free image captioning with retrieval-augmented pseudo sentence generation

Z Li, D Liu, H Wang, C Zhang, W Cai - Proceedings of the 6th ACM …, 2024 - dl.acm.org

Recently, training an image captioner without annotated image-sentence pairs has gained
traction. Previous methods have faced limitations due to either using mismatched corpora for …

Cited by: 1

Can Language Improve Visual Features For Distinguishing Unseen Plant Diseases?

JZ Liaw, AYH Chai, SH Lee, P Bonnet… - … Conference on Pattern …, 2025 - Springer

Deep learning approaches have been pivotal in identifying multi-plant diseases, yet they
often struggle with unseen data. The challenge of handling unseen data is significant due to …
