A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …
MART: Masked affective representation learning via masked temporal distribution distillation
Limited training data is a long-standing problem for video emotion analysis (VEA). Existing
works leverage the power of large-scale image datasets for transferring while failing to …
Adapt or perish: Adaptive sparse transformer with attentive feature refinement for image restoration
Transformer-based approaches have achieved promising performance in image restoration
tasks given their ability to model long-range dependencies, which is crucial for recovering …
ExtDM: Distribution extrapolation diffusion model for video prediction
Video prediction is a challenging task due to its inherent uncertainty, especially when
forecasting over a long period. To model the temporal dynamics, advanced methods benefit from …
LAKE-RED: Camouflaged images generation by latent background knowledge retrieval-augmented diffusion
Camouflaged vision perception is an important vision task with numerous practical
applications. Due to the expensive collection and labeling costs, this community struggles …
3D human mesh reconstruction by learning to sample joint adaptive tokens for transformers
Reconstructing 3D human mesh from a single RGB image is a challenging task due to the
inherent depth ambiguity. Researchers commonly use convolutional neural networks to …
Spatiotemporal correlation based self-adaptive pose estimation in complex scenes
In the current digital era, large-scale Artificial Intelligence (AI) models have fundamentally
changed the landscape of the AI field. These models excel at extracting complex patterns …
Two-stage co-segmentation network based on discriminative representation for recovering human mesh from videos
B Zhang, K Ma, S Wu, Z Yuan - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recovering 3D human mesh from videos has recently made significant progress. However,
most of the existing methods focus on the temporal consistency of videos, while ignoring the …
HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses
We present HumanNeRF-SE, a simple yet effective method that synthesizes diverse novel
pose images with simple input. Previous HumanNeRF works require a large number of …
Personalized graph generation for monocular 3D human pose and shape estimation
3D human pose and shape estimation from a single RGB image is an appealing yet
challenging task. Due to the graph-like nature of human parametric models, a growing …