Fashion meets computer vision: A survey

WH Cheng, S Song, CY Chen, SC Hidayati… - ACM Computing Surveys …, 2021 - dl.acm.org
Fashion is the way we present ourselves to the world and has become one of the world's
largest industries. Fashion, mainly conveyed by vision, has thus attracted much attention …

TryOnDiffusion: A tale of two UNets

L Zhu, D Yang, T Zhu, F Reda… - Proceedings of the …, 2023 - openaccess.thecvf.com
Given two images depicting a person and a garment worn by another person, our goal is to
generate a visualization of how the garment might look on the input person. A key challenge …

3D human pose estimation via intuitive physics

S Tripathi, L Müller, CHP Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Estimating 3D humans from images often produces implausible bodies that lean, float, or
penetrate the floor. Such methods ignore the fact that bodies are typically supported by the …

Deep hierarchical semantic segmentation

L Li, T Zhou, W Wang, J Li… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Humans are able to recognize structured relations in observation, allowing us to decompose
complex scenes into simpler parts and abstract the visual world in multiple levels. However …

Expressive talking head generation with granular audio-visual control

B Liang, Y Pan, Z Guo, H Zhou… - Proceedings of the …, 2022 - openaccess.thecvf.com
Generating expressive talking heads is essential for creating virtual humans. However,
existing one- or few-shot methods focus on lip-sync and head motion, ignoring the emotional …

AGORA: Avatars in geography optimized for regression analysis

P Patel, CHP Huang, J Tesch… - Proceedings of the …, 2021 - openaccess.thecvf.com
While the accuracy of 3D human pose estimation from images has steadily improved on
benchmark datasets, the best methods still fail in many real-world scenarios. This suggests …

Deep learning technique for human parsing: A survey and outlook

L Yang, W Jia, S Li, Q Song - International Journal of Computer Vision, 2024 - Springer
Human parsing aims to partition humans in image or video into multiple pixel-level semantic
parts. In the last decade, it has gained significantly increased interest in the computer vision …

Self-correction for human parsing

P Li, Y Xu, Y Wei, Y Yang - IEEE Transactions on Pattern …, 2020 - ieeexplore.ieee.org
Labeling pixel-level masks for fine-grained semantic segmentation tasks, e.g., human
parsing, remains a challenging task. The ambiguous boundary between different semantic …

Neural point-based graphics

KA Aliev, A Sevastopolsky, M Kolos, D Ulyanov… - Computer Vision–ECCV …, 2020 - Springer
We present a new point-based approach for modeling the appearance of real scenes. The
approach uses a raw point cloud as the geometric representation of a scene, and augments …

DECO: Dense estimation of 3D human-scene contact in the wild

S Tripathi, A Chatterjee, JC Passy… - Proceedings of the …, 2023 - openaccess.thecvf.com
Understanding how humans use physical contact to interact with the world is key to enabling
human-centric artificial intelligence. While inferring 3D contact is crucial for modeling …