Deep learning-based human pose estimation: A survey
Human pose estimation aims to locate the human body parts and build human body
representation (e.g., body skeleton) from input data such as images and videos. It has drawn …
PoseFormerV2: Exploring frequency domain for efficient and robust 3D human pose estimation
Recently, transformer-based methods have gained significant success in sequential 2D-to-
3D lifting human pose estimation. As a pioneering work, PoseFormer captures spatial …
Vision-based human pose estimation via deep learning: A survey
Human pose estimation (HPE) has attracted a significant amount of attention from the
computer vision community in the past decades. Moreover, HPE has been applied to various …
MHFormer: Multi-hypothesis transformer for 3D human pose estimation
Estimating 3D human poses from monocular videos is a challenging task due to depth
ambiguity and self-occlusion. Most existing works attempt to solve both issues by exploiting …
MixSTE: Seq2seq mixed spatio-temporal encoder for 3D human pose estimation in video
Recent transformer-based solutions have been introduced to estimate 3D human pose from
2D keypoint sequence by considering body joints among all frames globally to learn spatio …
MST++: Multi-stage spectral-wise transformer for efficient spectral reconstruction
Existing leading methods for spectral reconstruction (SR) focus on designing deeper or
wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB …
LocalViT: Bringing locality to vision transformers
We study how to introduce locality mechanisms into vision transformers. The transformer
network originates from machine translation and is particularly good at modelling long-range …
DiffPose: Toward more reliable 3D pose estimation
Monocular 3D human pose estimation is quite challenging due to the inherent ambiguity
and occlusion, which often lead to high uncertainty and indeterminacy. On the other hand …
Diffusion-based 3D human pose estimation with multi-hypothesis aggregation
In this paper, a novel Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise
reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed for probabilistic 3D …
TokenPose: Learning keypoint tokens for human pose estimation
Human pose estimation deeply relies on visual clues and anatomical constraints between
parts to locate keypoints. Most existing CNN-based methods do well in visual …