Deep learning in natural language processing: A state-of-the-art survey

J Chai, A Li - … on Machine Learning and Cybernetics (ICMLC), 2019 - ieeexplore.ieee.org
Deep learning raises interests of research community as their overwhelming successes in
information processing such specific tasks as video/speech recognition. In this paper, we …

Timeception for complex action recognition

N Hussein, E Gavves… - Proceedings of the …, 2019 - openaccess.thecvf.com
This paper focuses on the temporal aspect for recognizing human activities in videos; an
important visual cue that has long been undervalued. We revisit the conventional definition …

Exploiting feature and class relationships in video categorization with regularized deep neural networks

YG Jiang, Z Wu, J Wang, X Xue… - IEEE transactions on …, 2017 - ieeexplore.ieee.org
In this paper, we study the challenging problem of categorizing videos according to high-
level semantics such as the existence of a particular human action or a complex event …

Video generation from text

Y Li, M Min, D Shen, D Carlson, L Carin - Proceedings of the AAAI …, 2018 - ojs.aaai.org
Generating videos from text has proven to be a significant challenge for existing generative
models. We tackle this problem by training a conditional generative model to extract both …

Predicting visual features from text for image and video caption retrieval

J Dong, X Li, CGM Snoek - IEEE Transactions on Multimedia, 2018 - ieeexplore.ieee.org
This paper strives to find amidst a set of sentences the one best describing the content of a
given image or video. Different from existing works, which rely on a joint subspace for their …

Multi-shot temporal event localization: a benchmark

X Liu, Y Hu, S Bai, F Ding, X Bai… - Proceedings of the …, 2021 - openaccess.thecvf.com
Current developments in temporal event or action localization usually target actions
captured by a single camera. However, extensive events or actions in the wild may be …

SoccerNet: A scalable dataset for action spotting in soccer videos

S Giancola, M Amine, T Dghaily… - Proceedings of the …, 2018 - openaccess.thecvf.com
In this paper, we introduce SoccerNet, a benchmark for action spotting in soccer videos. The
dataset is composed of 500 complete soccer games from six main European leagues …

W2VV++: Fully deep learning for ad-hoc video search

X Li, C Xu, G Yang, Z Chen, J Dong - Proceedings of the 27th ACM …, 2019 - dl.acm.org
Ad-hoc video search (AVS) is an important yet challenging problem in multimedia retrieval.
Different from previous concept-based methods, we propose a fully deep learning method …

Hawkes processes for events in social media

MA Rizoiu, Y Lee, S Mishra, L Xie - Frontiers of multimedia research, 2017 - dl.acm.org
This chapter provides an accessible introduction for point processes, and especially Hawkes
processes, for modeling discrete, inter-dependent events over continuous time. We start by …

Omni-sourced webly-supervised learning for video recognition

H Duan, Y Zhao, Y Xiong, W Liu, D Lin - European conference on …, 2020 - Springer
We introduce OmniSource, a novel framework for leveraging web data to train video
recognition models. OmniSource overcomes the barriers between data formats, such as …