- Academic Search

Translating video content to natural language descriptions

Unravelling the impact of generative artificial intelligence (GAI) in industrial applications: A review of scientific and grey literature

AK Kar, PS Varsha, S Rajan - Global Journal of Flexible Systems …, 2023 - Springer

The scope of application of generative artificial intelligence (GAI) in industrial functions is
gaining high prominence in academic and industrial discourses. In this article, we explore …

Cited by 111 - All 3 versions

[PDF] cbs.dk

Natural language processing (NLP) in management research: A literature review

Y Kang, Z Cai, CW Tan, Q Huang… - Journal of Management …, 2020 - Taylor & Francis

Natural language processing (NLP) is gaining momentum in management research for its
ability to automatically analyze and comprehend human language. Yet, despite its extensive …

Cited by 567 - All 7 versions

[PDF] thecvf.com

End-to-end generative pretraining for multimodal video captioning

PH Seo, A Nagrani, A Arnab… - Proceedings of the …, 2022 - openaccess.thecvf.com

Recent video and language pretraining frameworks lack the ability to generate sentences.
We present Multimodal Video Generative Pretraining (MV-GPT), a new pretraining …

Cited by 205 - All 6 versions

[PDF] thecvf.com

End-to-end dense video captioning with parallel decoding

T Wang, R Zhang, Z Lu, F Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com

Dense video captioning aims to generate multiple associated captions with their temporal
locations from the video. Previous methods follow a sophisticated "localize-then-describe" …

Cited by 214 - All 6 versions

[PDF] thecvf.com

Spatio-temporal graph for video captioning with knowledge distillation

B Pan, H Cai, DA Huang, KH Lee… - Proceedings of the …, 2020 - openaccess.thecvf.com

Video captioning is a challenging task that requires a deep understanding of visual scenes.
State-of-the-art methods generate captions using either scene-level or object-level …

Cited by 341 - All 8 versions

[PDF] thecvf.com

Localizing moments in video with natural language

L Anne Hendricks, O Wang… - Proceedings of the …, 2017 - openaccess.thecvf.com

We consider retrieving a specific temporal segment, or moment, from a video given a natural
language text description. Methods designed to retrieve whole video clips with natural …

Cited by 1082 - All 10 versions

[PDF] thecvf.com

MSR-VTT: A large video description dataset for bridging video and language

J Xu, T Mei, T Yao, Y Rui - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com

While there has been increasing interest in the task of describing video with natural
language, current computer vision algorithms are still severely limited in terms of the …

Cited by 2301 - All 10 versions

[PDF] arxiv.org

Visual relationship detection with language priors

C Lu, R Krishna, M Bernstein, L Fei-Fei - … 11–14, 2016, Proceedings, Part I …, 2016 - Springer

Visual relationships capture a wide variety of interactions between pairs of objects in images
(e.g. “man riding bicycle” and “man pushing bicycle”). Consequently, the set of possible …

Cited by 1365 - All 14 versions

[PDF] acm.org

CLIP4Caption: CLIP for video caption

M Tang, Z Wang, Z Liu, F Rao, D Li, X Li - Proceedings of the 29th ACM …, 2021 - dl.acm.org

Video captioning is a challenging task since it requires generating sentences describing
various diverse and complex videos. Existing video captioning models lack adequate visual …

Cited by 155 - All 4 versions

[PDF] arxiv.org

Single-shot multi-person 3D pose estimation from monocular RGB

D Mehta, O Sotnychenko, F Mueller… - … Conference on 3D …, 2018 - ieeexplore.ieee.org

We propose a new single-shot method for multi-person 3D pose estimation in general
scenes from a monocular RGB camera. Our approach uses novel occlusion-robust pose …

Cited by 500 - All 10 versions
