- Academic Search

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE transactions on …, 2022 - ieeexplore.ieee.org

Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e., describing images …

Cited by 396; 11 versions

[PDF] arxiv.org

A comprehensive survey of deep learning for image captioning

MDZ Hossain, F Sohel, MF Shiratuddin… - ACM Computing Surveys …, 2019 - dl.acm.org

Generating a description of an image is called image captioning. Image captioning requires
recognizing the important objects, their attributes, and their relationships in an image. It also …

Cited by 1008; 8 versions

[PDF] thecvf.com

Zerocap: Zero-shot image-to-text generation for visual-semantic arithmetic

Y Tewel, Y Shalev, I Schwartz… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Recent text-to-image matching models apply contrastive learning to large corpora of
uncurated pairs of images and sentences. While such models can provide a powerful score …

Cited by 168; 6 versions

A survey of zero-shot learning: Settings, methods, and applications

W Wang, VW Zheng, H Yu, C Miao - ACM Transactions on Intelligent …, 2019 - dl.acm.org

Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …

Cited by 777; 2 versions

[PDF] thecvf.com

Visualgpt: Data-efficient adaptation of pretrained language models for image captioning

J Chen, H Guo, K Yi, B Li… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

The limited availability of annotated data often hinders real-world applications of machine
learning. To efficiently learn from small quantities of multimodal data, we leverage the …

Cited by 243; 12 versions

[PDF] researchgate.net

Artificial intelligence (AI) in augmented reality (AR)-assisted manufacturing applications: a review

CK Sahu, C Young, R Rai - International Journal of Production …, 2021 - Taylor & Francis

Augmented reality (AR) has proven to be an invaluable interactive medium to reduce
cognitive load by bridging the gap between the task-at-hand and relevant information by …

Cited by 231; 6 versions

[PDF] arxiv.org

Beyond IID: three levels of generalization for question answering on knowledge bases

Y Gu, S Kase, M Vanni, B Sadler, P Liang… - Proceedings of the Web …, 2021 - dl.acm.org

Existing studies on question answering on knowledge bases (KBQA) mainly operate with
the standard i.i.d. assumption, i.e., training distribution over questions is the same as the test …

Cited by 232; 6 versions

[PDF] thecvf.com

Neural baby talk

J Lu, J Yang, D Batra, D Parikh - Proceedings of the IEEE …, 2018 - openaccess.thecvf.com

We introduce a novel framework for image captioning that can produce natural language
explicitly grounded in entities that object detectors find in the image. Our approach …

Cited by 589; 9 versions

[PDF] thecvf.com

Improved image captioning via policy gradient optimization of spider

S Liu, Z Zhu, N Ye, S Guadarrama… - Proceedings of the …, 2017 - openaccess.thecvf.com

Current image captioning methods are usually trained via maximum likelihood estimation.
However, the log-likelihood score of a caption does not correlate well with human …

Cited by 558; 6 versions

[PDF] thecvf.com

Nocaps: Novel object captioning at scale

H Agrawal, K Desai, Y Wang, X Chen… - Proceedings of the …, 2019 - openaccess.thecvf.com

Image captioning models have achieved impressive results on datasets containing limited
visual concepts and large amounts of paired image-caption training data. However, if these …

Cited by 359; 11 versions

Saved to "My library"

Captioning images with diverse objects

From show to tell: A survey on deep learning-based image captioning

A comprehensive survey of deep learning for image captioning

Zerocap: Zero-shot image-to-text generation for visual-semantic arithmetic

A survey of zero-shot learning: Settings, methods, and applications

Visualgpt: Data-efficient adaptation of pretrained language models for image captioning

Artificial intelligence (AI) in augmented reality (AR)-assisted manufacturing applications: a review

Beyond IID: three levels of generalization for question answering on knowledge bases

Neural baby talk

Improved image captioning via policy gradient optimization of spider

Nocaps: Novel object captioning at scale