- Academic Search

Visuals to text: A comprehensive review on automatic image captioning

Y Ming, N Hu, C Fan, F Feng… - IEEE/CAA Journal of …, 2022 - researchportal.port.ac.uk

Image captioning refers to automatic generation of descriptive texts according to the visual
content of images. It is a technique integrating multiple disciplines including the computer …

Cited by 44 · All 6 versions

SmallCap: lightweight image captioning prompted with retrieval augmentation

R Ramos, B Martins, D Elliott… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent advances in image captioning have focused on scaling the data and model size,
substantially increasing the cost of pre-training and finetuning. As an alternative to large …

Cited by 97 · All 5 versions

Implicit identity representation conditioned memory compensation network for talking head video generation

FT Hong, D Xu - Proceedings of the IEEE/CVF International …, 2023 - openaccess.thecvf.com

Talking head video generation aims to animate a human face in a still image with dynamic
poses and expressions using motion information derived from a target-driving video, while …

Cited by 32 · All 5 versions

Retrieval-augmented generation for AI-generated content: A survey

P Zhao, H Zhang, Q Yu, Z Wang, Y Geng, F Fu… - arXiv preprint arXiv …, 2024 - arxiv.org

The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by
advancements in model algorithms, scalable foundation model architectures, and the …

Cited by 179 · All 4 versions

DeeCap: Dynamic early exiting for efficient image captioning

Z Fei, X Yan, S Wang, Q Tian - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com

Both accuracy and efficiency are crucial for image captioning in real-world scenarios.
Although Transformer-based models have gained significant improved captioning …

Cited by 49 · All 4 versions

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

J Li, DM Vo, A Sugimoto… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Large language models (LLMs)-based image captioning has the capability of describing
objects not explicitly observed in training data; yet novel objects occur frequently …

Cited by 19 · All 3 versions

Memory-based augmentation network for video captioning

S Jing, H Zhang, P Zeng, L Gao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Video captioning focuses on generating natural language descriptions according to the
video content. Existing works mainly explore this multimodal learning with the paired source …

Cited by 27 · All 2 versions

Attention-aligned transformer for image captioning

Z Fei - Proceedings of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

Recently, attention-based image captioning models, which are expected to ground correct
image regions for proper word generations, have achieved remarkable performance …

Cited by 39 · All 4 versions

Retrieval-augmented image captioning

R Ramos, D Elliott, B Martins - arXiv preprint arXiv:2302.08268, 2023 - arxiv.org

Inspired by retrieval-augmented language generation and pretrained Vision and Language
(V&L) encoders, we present a new approach to image captioning that generates sentences …

Cited by 33 · All 3 versions

Visual cluster grounding for image captioning

W Jiang, M Zhu, Y Fang, G Shi… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

Attention mechanisms have been extensively adopted in vision and language tasks such as
image captioning. It encourages a captioning model to dynamically ground appropriate …

Cited by 32 · All 5 versions
