Temporal segment networks: Towards good practices for deep action recognition
Deep convolutional networks have achieved great success for visual recognition in still
images. However, for action recognition in videos, the advantage over traditional methods is …
Prompting visual-language models for efficient video understanding
Image-based visual-language (I-VL) pre-training has shown great success for learning joint
visual-textual representations from large-scale web data, revealing remarkable ability for …
Recent advances in zero-shot recognition: Toward data-efficient understanding of visual content
With the recent renaissance of deep convolutional neural networks (CNNs), encouraging
breakthroughs have been achieved on the supervised recognition tasks, where each class …
Transfer learning and its extensive appositeness in human activity recognition: A survey
In this competitive world, the supervision and monitoring of human resources are primary
and necessary tasks to drive context-aware applications. Advancement in sensor and …
Graph convolutional networks for temporal action localization
Most state-of-the-art action localization systems process each action proposal individually,
without explicitly exploiting their relations during learning. However, the relations between …
A survey of zero-shot learning: Settings, methods, and applications
Most machine-learning methods focus on classifying instances whose classes have already
been seen in training. In practice, many applications require classifying instances whose …
An empirical study and analysis of generalized zero-shot learning for object recognition in the wild
We investigate the problem of generalized zero-shot learning (GZSL). GZSL relaxes the
unrealistic assumption in conventional zero-shot learning (ZSL) that test data belong only to …
I know the relationships: Zero-shot action recognition via two-stream graph convolutional networks and knowledge graphs
Recently, with the ever-growing action categories, zero-shot action recognition (ZSAR) has
been achieved by automatically mining the underlying concepts (e.g., actions, attributes) in …
Human mesh recovery from monocular images via a skeleton-disentangled representation
Y Sun, Y Ye, W Liu, W Gao, Y Fu… - Proceedings of the …, 2019 - openaccess.thecvf.com
We describe an end-to-end method for recovering 3D human body mesh from single images
and monocular videos. Different from the existing methods that try to obtain all the complex 3D …
TORE: Token reduction for efficient human mesh recovery with transformer
In this paper, we introduce a set of simple yet effective TOken REduction (TORE) strategies
for Transformer-based Human Mesh Recovery from monocular images. Current SOTA …