A survey on graph neural networks and graph transformers in computer vision: A task-oriented perspective

C Chen, Y Wu, Q Dai, HY Zhou, M Xu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Graph Neural Networks (GNNs) have gained momentum in graph representation learning
and boosted the state of the art in a variety of areas, such as data mining (e.g., social network …

MART: Masked affective representation learning via masked temporal distribution distillation

Z Zhang, P Zhao, E Park… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Limited training data is a long-standing problem for video emotion analysis (VEA). Existing
works leverage the power of large-scale image datasets for transferring while failing to …

Adapt or perish: Adaptive sparse transformer with attentive feature refinement for image restoration

S Zhou, D Chen, J Pan, J Shi… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Transformer-based approaches have achieved promising performance in image restoration
tasks given their ability to model long-range dependencies, which is crucial for recovering …

ExtDM: Distribution extrapolation diffusion model for video prediction

Z Zhang, J Hu, W Cheng, D Paudel… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video prediction is a challenging task due to its nature of uncertainty, especially for
forecasting a long period. To model the temporal dynamics, advanced methods benefit from …

LAKE-RED: Camouflaged images generation by latent background knowledge retrieval-augmented diffusion

P Zhao, P Xu, P Qin, DP Fan, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Camouflaged vision perception is an important vision task with numerous practical
applications. Due to the expensive collection and labeling costs, this community struggles …

3D human mesh reconstruction by learning to sample joint adaptive tokens for transformers

Y Xue, J Chen, Y Zhang, C Yu, H Ma… - Proceedings of the 30th …, 2022 - dl.acm.org
Reconstructing 3D human mesh from a single RGB image is a challenging task due to the
inherent depth ambiguity. Researchers commonly use convolutional neural networks to …

Spatiotemporal correlation based self-adaptive pose estimation in complex scenes

W Fu, Z Luo, S Liu, J Lloret… - Digital Communications …, 2024 - Elsevier
In the current digital era, large-scale Artificial Intelligence (AI) models have fundamentally
changed the landscape of the AI field. These models excel at extracting complex patterns …

Two-stage co-segmentation network based on discriminative representation for recovering human mesh from videos

B Zhang, K Ma, S Wu, Z Yuan - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Recovering 3D human mesh from videos has recently made significant progress. However,
most of the existing methods focus on the temporal consistency of videos, while ignoring the …

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

C Ma, YL Liu, Z Wang, W Liu, X Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present HumanNeRF-SE, a simple yet effective method that synthesizes diverse novel
pose images with simple input. Previous HumanNeRF works require a large number of …

Personalized graph generation for monocular 3D human pose and shape estimation

J Hu, H Zhang, Y Wang, M Ren… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
3D human pose and shape estimation from a single RGB image is an appealing yet
challenging task. Due to the graph-like nature of human parametric models, a growing …