Deep learning-based human pose estimation: A survey

C Zheng, W Wu, C Chen, T Yang, S Zhu, J Shen… - ACM Computing …, 2023 - dl.acm.org
Human pose estimation aims to locate the human body parts and build human body
representation (e.g., body skeleton) from input data such as images and videos. It has drawn …

PoseFormerV2: Exploring frequency domain for efficient and robust 3D human pose estimation

Q Zhao, C Zheng, M Liu, P Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, transformer-based methods have gained significant success in sequential 2D-to-
3D lifting human pose estimation. As a pioneering work, PoseFormer captures spatial …

Vision-based human pose estimation via deep learning: A survey

G Lan, Y Wu, F Hu, Q Hao - IEEE Transactions on Human …, 2022 - ieeexplore.ieee.org
Human pose estimation (HPE) has attracted a significant amount of attention from the
computer vision community in the past decades. Moreover, HPE has been applied to various …

MHFormer: Multi-hypothesis transformer for 3D human pose estimation

W Li, H Liu, H Tang, P Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Estimating 3D human poses from monocular videos is a challenging task due to depth
ambiguity and self-occlusion. Most existing works attempt to solve both issues by exploiting …

MixSTE: Seq2seq mixed spatio-temporal encoder for 3D human pose estimation in video

J Zhang, Z Tu, J Yang, Y Chen… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Recent transformer-based solutions have been introduced to estimate 3D human pose from
2D keypoint sequence by considering body joints among all frames globally to learn spatio …

MST++: Multi-stage spectral-wise transformer for efficient spectral reconstruction

Y Cai, J Lin, Z Lin, H Wang, Y Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Existing leading methods for spectral reconstruction (SR) focus on designing deeper or
wider convolutional neural networks (CNNs) to learn the end-to-end mapping from the RGB …

LocalViT: Bringing locality to vision transformers

Y Li, K Zhang, J Cao, R Timofte, L Van Gool - arXiv preprint arXiv …, 2021 - arxiv.org
We study how to introduce locality mechanisms into vision transformers. The transformer
network originates from machine translation and is particularly good at modelling long-range …

DiffPose: Toward more reliable 3D pose estimation

J Gong, LG Foo, Z Fan, Q Ke… - Proceedings of the …, 2023 - openaccess.thecvf.com
Monocular 3D human pose estimation is quite challenging due to the inherent ambiguity
and occlusion, which often lead to high uncertainty and indeterminacy. On the other hand …

Diffusion-based 3D human pose estimation with multi-hypothesis aggregation

W Shan, Z Liu, X Zhang, Z Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, a novel Diffusion-based 3D Pose estimation (D3DP) method with Joint-wise
reProjection-based Multi-hypothesis Aggregation (JPMA) is proposed for probabilistic 3D …

TokenPose: Learning keypoint tokens for human pose estimation

Y Li, S Zhang, Z Wang, S Yang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Human pose estimation deeply relies on visual clues and anatomical constraints between
parts to locate keypoints. Most existing CNN-based methods do well in visual …