Human pose estimation from monocular images: A comprehensive survey
Human pose estimation refers to the estimation of the location of body parts and how they
are connected in an image. Human pose estimation from monocular images has wide …
Monocular human pose estimation: A survey of deep learning-based methods
Vision-based monocular human pose estimation, as one of the most fundamental and
challenging problems in computer vision, aims to obtain posture of the human body from …
End-to-end multi-task learning with attention
We propose a novel multi-task learning architecture, which allows learning of task-specific
feature-level attention. Our design, the Multi-Task Attention Network (MTAN), consists of a …
Cross-stitch networks for multi-task learning
Multi-task learning in Convolutional Networks has displayed remarkable success in the field
of recognition. This success can be largely attributed to learning shared representations …
HyperFace: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition
We present an algorithm for simultaneous face detection, landmarks localization, pose
estimation and gender recognition using deep convolutional neural networks (CNN). The …
Region-based convolutional networks for accurate object detection and segmentation
Object detection performance, as measured on the canonical PASCAL VOC Challenge
datasets, plateaued in the final years of the competition. The best-performing methods were …
Hypercolumns for object segmentation and fine-grained localization
Recognition algorithms based on convolutional networks (CNNs) typically use the output of
the last layer as feature representation. However, the information in this layer may be too …
Dual super-resolution learning for semantic segmentation
Current state-of-the-art semantic segmentation methods often apply high-resolution input to
attain high performance, which brings large computation budgets and limits their …
Viewpoints and keypoints
We characterize the problem of pose estimation for rigid objects in terms of determining
viewpoint to explain coarse pose and keypoint prediction to capture the finer details. We …
PaStaNet: Toward human activity knowledge engine
Existing image-based activity understanding methods mainly adopt direct mapping, i.e., from
image to activity concepts, which may encounter performance bottleneck since the huge …