A review of deep learning for video captioning
Video captioning (VC) is a fast-moving, cross-disciplinary area of research that comprises
contributions from domains such as computer vision, natural language processing …
An outlook into the future of egocentric vision
What will the future be? We wonder! In this survey, we explore the gap between current
research in egocentric vision and the ever-anticipated future, where wearable computing …
Video Summarization Techniques: A Comprehensive Review
The rapid expansion of video content across a variety of industries, including social media,
education, entertainment, and surveillance, has made video summarization an essential …
A comprehensive study of automatic video summarization techniques
D Gupta, A Sharma - Artificial Intelligence Review, 2023 - Springer
Video summarization deals with the generation of a condensed version of the original video
by including meaningful frames or segments while eliminating redundant information. The …
Seeing the unseen: Predicting the first-person camera wearer's location and pose in third-person scenes
Y Wen, KK Singh, M Anderson… - Proceedings of the …, 2021 - openaccess.thecvf.com
Our goal is to predict the camera wearer's location and pose in his/her environment based
on what's captured by the camera wearer's first-person wearable camera. Toward this goal …
Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Videos
Given a multi-view video, which viewpoint is most informative for a human observer?
Existing methods rely on heuristics or expensive "best-view" supervision to answer this …
Switch-a-View: Few-Shot View Selection Learned from Edited Videos
We introduce Switch-a-View, a model that learns to automatically select the viewpoint to
display at each timepoint when creating a how-to video. The key insight of our approach is …
Deep Learning Approach for Seamless Navigation in Multi-View Streaming Applications
Quality of Experience (QoE) in multi-view streaming systems is known to be severely
affected by the latency associated with view-switching procedures. Anticipating the …
Distributed multi-agent video fast-forwarding
In many intelligent systems, a network of agents collaboratively perceives the environment
for better and more efficient situation awareness. As these agents often have limited …
Streamlining Video Summarization with NLP: Techniques, Implementation, and Future Direction
The rapid growth of digital video content presents significant challenges in information
accessibility and consumption, creating a pressing need for efficient video summarization …