DySeT: A Dynamic Masked Self-distillation Approach for Robust Trajectory Prediction
The lack of generalization capability of behavior prediction models for autonomous vehicles
is a crucial concern for safe motion planning. One way to address this is via self-supervised …
Masked Image Modeling: A Survey
In this work, we survey recent studies on masked image modeling (MIM), an approach that
emerged as a powerful self-supervised learning technique in computer vision. The MIM task …
Unified Human-centric Model, Framework and Benchmark: A Survey
Human-centric Computer Vision Tasks (HCTs) refer to a series of tasks related to the human
body, such as Human Pose Estimation, Pedestrian Tracking, Re-Identification (ReID) …
PoseEmbroider: Towards a 3D, Visual, Semantic-Aware Human Pose Representation
Aligning multiple modalities in a latent space, such as images and texts, has shown to
produce powerful semantic visual representations, fueling tasks like image captioning, text …
Self-Supervised Learning of Whole and Component-Based Semantic Representations for Person Re-Identification
Person Re-Identification (ReID) is a challenging problem, focusing on identifying individuals
across diverse settings. However, previous ReID methods primarily concentrated on a single …
SPSNet: semantic-guided perspective shift network for robust person re-identification in drone imagery
H Wei, Q Li, J Pan, J Chen, Y Zhang, L Qi, Y Zhou - The Visual Computer, 2024 - Springer
Person re-identification using drone technology is increasingly important but faces
challenges due to morphological compression and perspective distortions. This study …
RefHCM: A Unified Model for Referring Perceptions in Human-Centric Scenarios
Human-centric perceptions play a crucial role in real-world applications. While recent
human-centric works have achieved impressive progress, these efforts are often constrained …
Multi Positive Contrastive Learning with Pose-Consistent Generated Images
S Inayoshi, AR Widya, S Ozaki, J Otsuka… - arXiv preprint arXiv …, 2024 - arxiv.org
Model pre-training has become essential in various recognition tasks. Meanwhile, with the
remarkable advancements in image generation models, pre-training methods utilizing …
Spatio-Temporal Side Tuning Pre-trained Foundation Models for Video-based Pedestrian Attribute Recognition
Existing pedestrian attribute recognition (PAR) algorithms are mainly developed based on a
static image, however, the performance is unreliable in challenging scenarios, such as …