- Academic Search

Grounded SAM: Assembling open-world models for diverse visual tasks

T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org

We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to
combine with the segment anything model (SAM). This integration enables the detection and …

Cited by 240 · All 2 versions

[PDF] thecvf.com

SkeletonMAE: graph-based masked autoencoder for skeleton sequence pre-training

H Yan, Y Liu, Y Wei, Z Li, G Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Skeleton sequence representation learning has shown great advantages for action
recognition due to its promising ability to model human joints and topology. However, the …

Cited by 51 · All 5 versions

[PDF] arxiv.org

Deep learning technique for human parsing: A survey and outlook

L Yang, W Jia, S Li, Q Song - International Journal of Computer Vision, 2024 - Springer

Human parsing aims to partition humans in image or video into multiple pixel-level semantic
parts. In the last decade, it has gained significantly increased interest in the computer vision …

Cited by 21 · All 3 versions

[PDF] thecvf.com

HumanMAC: Masked motion completion for human motion prediction

LH Chen, J Zhang, Y Li, Y Pang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Human motion prediction is a classical problem in computer vision and computer graphics,
which has a wide range of practical applications. Previous effects achieve great empirical …

Cited by 65 · All 6 versions

[PDF] arxiv.org

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions

J Yang, X Niu, N Jiang, R Zhang, S Huang - European Conference on …, 2024 - Springer

Existing 3D human object interaction (HOI) datasets and models simply align global
descriptions with the long HOI sequence, while lacking a detailed understanding of …

Cited by 2 · All 6 versions

[PDF] ieee.org

Unified Human-centric Model, Framework and Benchmark: A Survey

X Zhao, S Sulaiman, WY Leng - IEEE Access, 2024 - ieeexplore.ieee.org

Human-centric Computer Vision Tasks (HCTs) refer to a series of tasks related to the human
body, such as Human Pose Estimation, Pedestrian Tracking, Re-Identification (ReID) …

All 2 versions

X-Pose: Detecting any keypoints

J Yang, A Zeng, R Zhang, L Zhang - European Conference on Computer …, 2024 - Springer

This work aims to address an advanced keypoint detection problem: how to accurately
detect any keypoints in complex real-world scenarios, which involves massive, messy, and …

Cited by 2 · All 2 versions

[PDF] arxiv.org

OLAF: A Plug-and-Play Framework for Enhanced Multi-object Multi-part Scene Parsing

P Gupta, R Singh, P Shenoy… - European Conference on …, 2024 - Springer

Multi-object multi-part scene segmentation is a challenging task whose complexity scales
exponentially with part granularity and number of scene objects. To address the task, we …

All 8 versions

From Simple to Complex Scenes: Learning Robust Feature Representations for Accurate Human Parsing

Y Liu, C Wang, M Lu, J Yang, J Gui… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Human parsing has attracted considerable research interest due to its broad potential
applications in the computer vision community. In this paper, we explore several useful …

Cited by 7 · All 4 versions

[PDF] arxiv.org

KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension

J Yang, W Zeng, S Jin, L Xu, W Liu, C Qian… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advancements in Multimodal Large Language Models (MLLMs) have greatly
improved their abilities in image understanding. However, these models often struggle with …

All 2 versions

Semantic human parsing via scalable semantic transfer over multiple label domains