Google Scholar

Video description: A survey of methods, datasets, and evaluation metrics

N Aafaq, A Mian, W Liu, SZ Gilani, M Shah - ACM Computing Surveys …, 2019 - dl.acm.org

Video description is the automatic generation of natural language sentences that describe
the contents of a given video. It has applications in human-robot interaction, helping the …

Cited by 255; 9 versions

[HTML] mdpi.com

[HTML] A comprehensive review on multiple instance learning

S Fatima, S Ali, …

Cited by 1519; 8 versions

[PDF] arxiv.org

Multiple instance learning: A survey of problem characteristics and applications

MA Carbonneau, V Cheplygina, E Granger, G Gagnon - Pattern recognition, 2018 - Elsevier

Multiple instance learning (MIL) is a form of weakly supervised learning where training
instances are arranged in sets, called bags, and a label is provided for the entire bag. This …

Cited by 797; 14 versions

[PDF] thecvf.com

Spatio-temporal dynamics and semantic attribute enriched visual encoding for video captioning

N Aafaq, N Akhtar, W Liu, SZ Gilani… - Proceedings of the …, 2019 - openaccess.thecvf.com

Automatic generation of video captions is a fundamental challenge in computer vision.
Recent techniques typically employ a combination of Convolutional Neural Networks …

Cited by 291; 11 versions

[PDF] aaai.org

Multilevel language and vision integration for text-to-clip retrieval

H Xu, K He, BA Plummer, L Sigal, S Sclaroff… - Proceedings of the …, 2019 - ojs.aaai.org

We address the problem of text-based activity retrieval in video. Given a sentence describing
an activity, our task is to retrieve matching clips from an untrimmed video. To capture the …

Cited by 349; 10 versions

Describing video with attention-based bidirectional LSTM

Y Bin, Y Yang, F Shen, N Xie… - IEEE transactions on …, 2018 - ieeexplore.ieee.org

Video captioning has been attracting broad research attention in the multimedia community.
However, most existing approaches heavily rely on static visual information or partially …

Cited by 274; 5 versions

[PDF] thecvf.com

Video paragraph captioning using hierarchical recurrent neural networks

H Yu, J Wang, Z Huang, Y Yang… - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com

We present an approach that exploits hierarchical Recurrent Neural Networks (RNNs) to
tackle the video captioning problem, i.e., generating one or multiple sentences to describe a …

Cited by 735; 14 versions

Video captioning by adversarial LSTM

Y Yang, J Zhou, J Ai, Y Bin, A Hanjalic… - … on Image Processing, 2018 - ieeexplore.ieee.org

In this paper, we propose a novel approach to video captioning based on adversarial
learning and long short-term memory (LSTM). With this solution concept, we aim at …

Cited by 220; 7 versions

[PDF] neurips.cc

Weakly supervised dense event captioning in videos

X Duan, W Huang, C Gan, J Wang… - Advances in Neural …, 2018 - proceedings.neurips.cc

Dense event captioning aims to detect and describe all events of interest contained in a
video. Despite the advanced development in this area, existing methods tackle this task by …

Cited by 180; 7 versions


A multi-scale multiple instance video description network

Video description: A survey of methods, datasets, and evaluation metrics
