DySeT: A Dynamic Masked Self-distillation Approach for Robust Trajectory Prediction

M Pourkeshavarz, J Zhang, A Rasouli - European Conference on …, 2024 - Springer
The lack of generalization capability of behavior prediction models for autonomous vehicles
is a crucial concern for safe motion planning. One way to address this is via self-supervised …

Masked Image Modeling: A Survey

V Hondru, FA Croitoru, S Minaee, RT Ionescu… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we survey recent studies on masked image modeling (MIM), an approach that
emerged as a powerful self-supervised learning technique in computer vision. The MIM task …

Unified Human-centric Model, Framework and Benchmark: A Survey

X Zhao, S Sulaiman, WY Leng - IEEE Access, 2024 - ieeexplore.ieee.org
Human-centric Computer Vision Tasks (HCTs) refer to a series of tasks related to the human
body, such as Human Pose Estimation, Pedestrian Tracking, Re-Identification (ReID) …

PoseEmbroider: Towards a 3D, Visual, Semantic-Aware Human Pose Representation

G Delmas, P Weinzaepfel, F Moreno-Noguer… - … on Computer Vision, 2024 - Springer
Aligning multiple modalities in a latent space, such as images and texts, has been shown to
produce powerful semantic visual representations, fueling tasks like image captioning, text …

Self-Supervised Learning of Whole and Component-Based Semantic Representations for Person Re-Identification

S Huang, Y Zhou, R Prabhakar, X Liu, Y Guo… - arXiv preprint arXiv …, 2023 - arxiv.org
Person Re-Identification (ReID) is a challenging problem, focusing on identifying individuals
across diverse settings. However, previous ReID methods primarily concentrated on a single …

SPSNet: semantic-guided perspective shift network for robust person re-identification in drone imagery

H Wei, Q Li, J Pan, J Chen, Y Zhang, L Qi, Y Zhou - The Visual Computer, 2024 - Springer
Person re-identification using drone technology is increasingly important but faces
challenges due to morphological compression and perspective distortions. This study …

RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios

J Huang, R Hou, J Zhao, H Chang, S Shan - arXiv preprint arXiv …, 2024 - arxiv.org
Human-centric perceptions play a crucial role in real-world applications. While recent
human-centric works have achieved impressive progress, these efforts are often constrained …

Multi Positive Contrastive Learning with Pose-Consistent Generated Images

S Inayoshi, AR Widya, S Ozaki, J Otsuka… - arXiv preprint arXiv …, 2024 - arxiv.org
Model pre-training has become essential in various recognition tasks. Meanwhile, with the
remarkable advancements in image generation models, pre-training methods utilizing …

Spatio-Temporal Side Tuning Pre-trained Foundation Models for Video-based Pedestrian Attribute Recognition

X Wang, Q Zhu, J Jin, J Zhu, F Wang, B Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing pedestrian attribute recognition (PAR) algorithms are mainly developed based on a
static image; however, the performance is unreliable in challenging scenarios, such as …