Recent advances in zero-shot recognition: Toward data-efficient understanding of visual content

Y Fu, T Xiang, YG Jiang, X Xue… - IEEE Signal …, 2018 - ieeexplore.ieee.org
With the recent renaissance of deep convolutional neural networks (CNNs), encouraging
breakthroughs have been achieved on the supervised recognition tasks, where each class …

Expanding language-image pretrained models for general video recognition

B Ni, H Peng, M Chen, S Zhang, G Meng, J Fu… - … on Computer Vision, 2022 - Springer
Contrastive language-image pretraining has shown great success in learning visual-textual
joint representation from web-scale data, demonstrating remarkable “zero-shot” …

Recent advances in transfer learning for cross-dataset visual recognition: A problem-oriented perspective

J Zhang, W Li, P Ogunbona, D Xu - ACM Computing Surveys (CSUR), 2019 - dl.acm.org
This article takes a problem-oriented perspective and presents a comprehensive review of
transfer-learning methods, both shallow and deep, for cross-dataset visual recognition …

Fine-tuned CLIP models are efficient video learners

H Rasheed, MU Khattak, M Maaz… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP
model. Since training on a similar scale for videos is infeasible, recent approaches focus on …

TN-ZSTAD: Transferable network for zero-shot temporal activity detection

L Zhang, X Chang, J Liu, M Luo, Z Li… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org
An integral part of video analysis and surveillance is temporal activity detection, which
means to simultaneously recognize and localize activities in long untrimmed videos …

Vita-CLIP: Video and text adaptive CLIP via multimodal prompting

ST Wasim, M Naseer, S Khan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Adopting contrastive image-text pretrained models like CLIP towards video classification has
gained attention due to its cost-effectiveness and competitive performance. However, recent …

A survey of zero-shot learning: Settings, methods, and applications

W Wang, VW Zheng, H Yu, C Miao - ACM Transactions on Intelligent …, 2019 - dl.acm.org
Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …

A comprehensive study of deep video action recognition

Y Zhu, X Li, C Liu, M Zolfaghari, Y Xiong, C Wu… - arXiv preprint arXiv …, 2020 - arxiv.org
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …

Attentive region embedding network for zero-shot learning

GS Xie, L Liu, X Jin, F Zhu, Z Zhang… - Proceedings of the …, 2019 - openaccess.thecvf.com
Zero-shot learning (ZSL) aims to classify images from unseen categories, by merely utilizing
seen class images as the training data. Existing works on ZSL mainly leverage the global …

Elaborative rehearsal for zero-shot action recognition

S Chen, D Huang - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
The growing number of action classes has posed a new challenge for video understanding,
making Zero-Shot Action Recognition (ZSAR) a thriving direction. The ZSAR task aims to …