Recent advances in zero-shot recognition: Toward data-efficient understanding of visual content
With the recent renaissance of deep convolutional neural networks (CNNs), encouraging
breakthroughs have been achieved on the supervised recognition tasks, where each class …
Expanding language-image pretrained models for general video recognition
Contrastive language-image pretraining has shown great success in learning visual-textual
joint representation from web-scale data, demonstrating remarkable “zero-shot” …
Recent advances in transfer learning for cross-dataset visual recognition: A problem-oriented perspective
This article takes a problem-oriented perspective and presents a comprehensive review of
transfer-learning methods, both shallow and deep, for cross-dataset visual recognition …
Fine-tuned CLIP models are efficient video learners
Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP
model. Since training on a similar scale for videos is infeasible, recent approaches focus on …
TN-ZSTAD: Transferable network for zero-shot temporal activity detection
An integral part of video analysis and surveillance is temporal activity detection, which
means to simultaneously recognize and localize activities in long untrimmed videos …
Vita-CLIP: Video and text adaptive CLIP via multimodal prompting
Adopting contrastive image-text pretrained models like CLIP towards video classification has
gained attention due to its cost-effectiveness and competitive performance. However, recent …
A survey of zero-shot learning: Settings, methods, and applications
Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …
A comprehensive study of deep video action recognition
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …
Attentive region embedding network for zero-shot learning
Zero-shot learning (ZSL) aims to classify images from unseen categories, by merely utilizing
seen class images as the training data. Existing works on ZSL mainly leverage the global …
Elaborative rehearsal for zero-shot action recognition
The growing number of action classes has posed a new challenge for video understanding,
making Zero-Shot Action Recognition (ZSAR) a thriving direction. The ZSAR task aims to …