Deep neural network concepts for background subtraction: A systematic review and comparative evaluation

T Bouwmans, S Javed, M Sultana, SK Jung - Neural Networks, 2019 - Elsevier
Conventional neural networks have been demonstrated to be a powerful framework for
background subtraction in video acquired by static cameras. Indeed, the well-known Self …

Video description: A survey of methods, datasets, and evaluation metrics

N Aafaq, A Mian, W Liu, SZ Gilani, M Shah - ACM Computing Surveys …, 2019 - dl.acm.org
Video description is the automatic generation of natural language sentences that describe
the contents of a given video. It has applications in human-robot interaction, helping the …

BEVT: BERT pretraining of video transformers

R Wang, D Chen, Z Wu, Y Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
This paper studies the BERT pretraining of video transformers. It is a straightforward but
worth-studying extension given the recent success from BERT pretraining of image …

WildDeepfake: A challenging real-world dataset for deepfake detection

B Zi, M Chang, J Chen, X Ma, YG Jiang - Proceedings of the 28th ACM …, 2020 - dl.acm.org
In recent years, the abuse of a face swap technique called deepfake has raised enormous
public concerns. So far, a large number of deepfake videos (known as "deepfakes") have …

Masked video distillation: Rethinking masked feature modeling for self-supervised video representation learning

R Wang, D Chen, Z Wu, Y Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Benefiting from masked visual modeling, self-supervised video representation learning has
achieved remarkable progress. However, existing methods focus on learning …

TokenLearner: Adaptive space-time tokenization for videos

M Ryoo, AJ Piergiovanni, A Arnab… - Advances in neural …, 2021 - proceedings.neurips.cc
In this paper, we introduce a novel visual representation learning which relies on a handful
of adaptively learned tokens, and which is applicable to both image and video …

Rethinking spatiotemporal feature learning: Speed-accuracy trade-offs in video classification

S Xie, C Sun, J Huang, Z Tu… - Proceedings of the …, 2018 - openaccess.thecvf.com
Despite the steady progress in video analysis led by the adoption of convolutional neural
networks (CNNs), the relative improvement has been less drastic as that in 2D static image …

ECO: Efficient convolutional network for online video understanding

M Zolfaghari, K Singh, T Brox - Proceedings of the …, 2018 - openaccess.thecvf.com
The state of the art in video understanding suffers from two problems: (1) The major part of
reasoning is performed locally in the video, thus missing important relationships within …

ISTVT: interpretable spatial-temporal video transformer for deepfake detection

C Zhao, C Wang, G Hu, H Chen, C Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the rapid development of Deepfake synthesis technology, our information security and
personal privacy have been severely threatened in recent years. To achieve a robust …

YouTube-8M: A large-scale video classification benchmark

S Abu-El-Haija, N Kothari, J Lee, P Natsev… - arXiv preprint arXiv …, 2016 - arxiv.org
Many recent advancements in Computer Vision are attributed to large datasets. Open-
source software packages for Machine Learning and inexpensive commodity hardware …