Unravelling the impact of generative artificial intelligence (GAI) in industrial applications: A review of scientific and grey literature

AK Kar, PS Varsha, S Rajan - Global Journal of Flexible Systems …, 2023 - Springer
The scope of application of generative artificial intelligence (GAI) in industrial functions is
gaining high prominence in academic and industrial discourses. In this article, we explore …

Natural language processing (NLP) in management research: A literature review

Y Kang, Z Cai, CW Tan, Q Huang… - Journal of Management …, 2020 - Taylor & Francis
Natural language processing (NLP) is gaining momentum in management research for its
ability to automatically analyze and comprehend human language. Yet, despite its extensive …

End-to-end generative pretraining for multimodal video captioning

PH Seo, A Nagrani, A Arnab… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recent video and language pretraining frameworks lack the ability to generate sentences.
We present Multimodal Video Generative Pretraining (MV-GPT), a new pretraining …

End-to-end dense video captioning with parallel decoding

T Wang, R Zhang, Z Lu, F Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com
Dense video captioning aims to generate multiple associated captions with their temporal
locations from the video. Previous methods follow a sophisticated "localize-then-describe" …

Spatio-temporal graph for video captioning with knowledge distillation

B Pan, H Cai, DA Huang, KH Lee… - Proceedings of the …, 2020 - openaccess.thecvf.com
Video captioning is a challenging task that requires a deep understanding of visual scenes.
State-of-the-art methods generate captions using either scene-level or object-level …

Localizing moments in video with natural language

L Anne Hendricks, O Wang… - Proceedings of the …, 2017 - openaccess.thecvf.com
We consider retrieving a specific temporal segment, or moment, from a video given a natural
language text description. Methods designed to retrieve whole video clips with natural …

MSR-VTT: A large video description dataset for bridging video and language

J Xu, T Mei, T Yao, Y Rui - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com
While there has been increasing interest in the task of describing video with natural
language, current computer vision algorithms are still severely limited in terms of the …

Visual relationship detection with language priors

C Lu, R Krishna, M Bernstein, L Fei-Fei - … 11–14, 2016, Proceedings, Part I …, 2016 - Springer
Visual relationships capture a wide variety of interactions between pairs of objects in images
(e.g. “man riding bicycle” and “man pushing bicycle”). Consequently, the set of possible …

CLIP4Caption: CLIP for video caption

M Tang, Z Wang, Z Liu, F Rao, D Li, X Li - Proceedings of the 29th ACM …, 2021 - dl.acm.org
Video captioning is a challenging task since it requires generating sentences describing
various diverse and complex videos. Existing video captioning models lack adequate visual …

Single-shot multi-person 3D pose estimation from monocular RGB

D Mehta, O Sotnychenko, F Mueller… - … Conference on 3D …, 2018 - ieeexplore.ieee.org
We propose a new single-shot method for multi-person 3D pose estimation in general
scenes from a monocular RGB camera. Our approach uses novel occlusion-robust pose …