- Academic Search

M Abdar, M Kollati, S Kuraparthi… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Video captioning (VC) is a fast-moving, cross-disciplinary area of research that comprises
contributions from domains such as computer vision, natural language processing …

Cited by: 21

Video description: A comprehensive survey of deep learning approaches

G Rafiq, M Rafiq, GS Choi - Artificial Intelligence Review, 2023 - Springer

Video description refers to understanding visual content and transforming that acquired
understanding into automatic textual narration. It bridges the key AI fields of computer vision …

Cited by: 30

Adapt: Action-aware driving caption transformer

B Jin, X Liu, Y Zheng, P Li, H Zhao… - … on Robotics and …, 2023 - ieeexplore.ieee.org

End-to-end autonomous driving has great potential in the transportation industry. However,
the lack of transparency and interpretability of the automatic decision-making process …

Cited by: 67

Video captioning using global-local representation

L Yan, S Ma, Q Wang, Y Chen, X Zhang… - … on Circuits and …, 2022 - ieeexplore.ieee.org

Video captioning is a challenging task as it needs to accurately transform visual
understanding into natural language description. To date, state-of-the-art methods …

Cited by: 97

Exploring group video captioning with efficient relational approximation

W Lin, T Jin, Y Wang, W Pan, L Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Current video captioning efforts most focus on describing a single video while the need for
captioning videos in groups has increased considerably. In this study, we propose a new …

Cited by: 10

Refined semantic enhancement towards frequency diffusion for video captioning

X Zhong, Z Li, S Chen, K Jiang, C Chen… - Proceedings of the AAAI …, 2023 - ojs.aaai.org

Video captioning aims to generate natural language sentences that describe the given video
accurately. Existing methods obtain favorable generation by exploring richer visual …

Cited by: 34

TAVT: Towards Transferable Audio-Visual Text Generation

W Lin, T Jin, W Pan, L Li, X Cheng… - Proceedings of the …, 2023 - aclanthology.org

Audio-visual text generation aims to understand multi-modality contents and translate them
into texts. Although various transfer learning techniques of text generation have been …

Cited by: 12

Dyadformer: A multi-modal transformer for long-range modeling of dyadic interactions

D Curto, A Clapés, J Selva… - Proceedings of the …, 2021 - openaccess.thecvf.com

Personality computing has become an emerging topic in computer vision, due to the wide
range of applications it can be used for. However, most works on the topic have focused on …

Cited by: 35

Evolution of visual data captioning Methods, Datasets, and evaluation Metrics: A comprehensive survey

D Sharma, C Dhiman, D Kumar - Expert Systems with Applications, 2023 - Elsevier

Automatic Visual Captioning (AVC) generates syntactically and semantically correct
sentences by describing important objects, attributes, and their relationships with each other …

Cited by: 14

Deep learning and knowledge graph for image/video captioning: A review of datasets, evaluation metrics, and methods

MS Wajid, H Terashima‐Marin, P Najafirad… - Engineering …, 2024 - Wiley Online Library

Generating an image/video caption has always been a fundamental problem of Artificial
Intelligence, which is usually performed using the potential of Deep Learning Methods …

Cited by: 16

SBAT: Video captioning with sparse boundary-aware transformer

A review of deep learning for video captioning
