Emergent correspondence from image diffusion
Finding correspondences between images is a fundamental problem in computer vision. In
this paper, we show that correspondence emerges in image diffusion models without any …
Deep vit features as dense visual descriptors
We study the use of deep features extracted from a pretrained Vision Transformer (ViT) as
dense visual descriptors. We observe and empirically demonstrate that such features, when …
Reco: Retrieve and co-segment for zero-shot transfer
Semantic segmentation has a broad range of applications, but its real-world impact has
been significantly limited by the prohibitive annotation costs necessary to enable …
Segmenting objects from relational visual data
In this article, we model a set of pixelwise object segmentation tasks—automatic video
segmentation (AVS), image co-segmentation (ICS) and few-shot semantic segmentation …
Zero-shot video object segmentation via attentive graph neural networks
This work proposes a novel attentive graph neural network (AGNN) for zero-shot video
object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative …
Overview of temporal action detection based on deep learning
K Hu, C Shen, T Wang, K Xu, Q Xia, M Xia… - Artificial Intelligence …, 2024 - Springer
Temporal Action Detection (TAD) aims to accurately capture each action interval in
an untrimmed video and to understand human actions. This paper comprehensively surveys …
Crnet: Cross-reference networks for few-shot segmentation
Over the past few years, state-of-the-art image segmentation algorithms are based on deep
convolutional neural networks. To render a deep network with the ability to understand a …
ICNet: Information conversion network for RGB-D based salient object detection
RGB-D based salient object detection (SOD) methods leverage the depth map as a valuable
complementary information for better SOD performance. Previous methods mainly resort to …
A unified transformer framework for group-based segmentation: Co-segmentation, co-saliency detection and video salient object detection
Humans tend to mine objects by learning from a group of images or several frames of video
since we live in a dynamic world. In the computer vision area, many researchers focus on co …
Dreamscene360: Unconstrained text-to-3d scene generation with panoramic gaussian splatting
The increasing demand for virtual reality applications has highlighted the significance of
crafting immersive 3D assets. We present a text-to-3D 360° scene generation pipeline that …