OMG-Seg: Is one model good enough for all segmentation?
In this work, we address various segmentation tasks, each traditionally tackled by distinct or
partially unified models. We propose OMG-Seg, One Model that is Good enough to efficiently …
OMG-LLaVA: Bridging image-level, object-level, pixel-level reasoning and understanding
Current universal segmentation methods demonstrate strong capabilities in pixel-level
image and video understanding. However, they lack reasoning abilities and cannot be …
Point cloud mamba: Point cloud learning via state space model
In this work, for the first time, we demonstrate that Mamba-based point cloud methods can
outperform point-based methods. Mamba exhibits strong global modeling capabilities and …
Explore in-context segmentation via latent diffusion models
In-context segmentation has drawn more attention with the introduction of vision foundation
models. Most existing approaches adopt metric learning or masked image modeling to build …
VG4D: Vision-Language Model Goes 4D Video Recognition
Understanding the real world through point cloud video is a crucial aspect of robotics and
autonomous driving systems. However, prevailing methods for 4D point cloud recognition …
USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation
Contrastive learning has achieved great success in skeleton-based representation learning
recently. However, the prevailing methods are predominantly negative-based, necessitating …
MacDiff: Unified Skeleton Modeling with Masked Conditional Diffusion
Self-supervised learning has proved effective for skeleton-based human action
understanding. However, previous works either rely on contrastive learning that suffers false …
Point-In-Context: Understanding Point Cloud via In-Context Learning
With the emergence of large-scale models trained on diverse datasets, in-context learning
has emerged as a promising paradigm for multitasking, notably in natural language …
CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition
Skeleton-based multi-entity action recognition is a challenging task aiming to identify
interactive actions or group activities involving multiple diverse entities. Existing models for …
MKTZ: multi-semantic embedding and key frame masking techniques for zero-shot skeleton action recognition
H Chen, S Guo, Z Chen - Multimedia Systems, 2024 - Springer
The fundamental task of zero-shot skeleton-based action recognition is to learn existing
skeletal actions during the training phase and to accurately identify unseen actions during …