Emergent correspondence from image diffusion

L Tang, M Jia, Q Wang, CP Phoo… - Advances in Neural …, 2023 - proceedings.neurips.cc
Finding correspondences between images is a fundamental problem in computer vision. In
this paper, we show that correspondence emerges in image diffusion models without any …

Deep vit features as dense visual descriptors

S Amir, Y Gandelsman, S Bagon… - arXiv preprint arXiv …, 2021 - dino-vit-features.github.io
We study the use of deep features extracted from a pretrained Vision Transformer (ViT) as
dense visual descriptors. We observe and empirically demonstrate that such features, when …

Reco: Retrieve and co-segment for zero-shot transfer

G Shin, W Xie, S Albanie - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Semantic segmentation has a broad range of applications, but its real-world impact has
been significantly limited by the prohibitive annotation costs necessary to enable …

Segmenting objects from relational visual data

X Lu, W Wang, J Shen, DJ Crandall… - IEEE transactions on …, 2021 - ieeexplore.ieee.org
In this article, we model a set of pixelwise object segmentation tasks—automatic video
segmentation (AVS), image co-segmentation (ICS) and few-shot semantic segmentation …

Zero-shot video object segmentation via attentive graph neural networks

W Wang, X Lu, J Shen… - Proceedings of the …, 2019 - openaccess.thecvf.com
This work proposes a novel attentive graph neural network (AGNN) for zero-shot video
object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative …

Overview of temporal action detection based on deep learning

K Hu, C Shen, T Wang, K Xu, Q Xia, M Xia… - Artificial Intelligence …, 2024 - Springer
Temporal Action Detection (TAD) aims to accurately capture each action interval in
an untrimmed video and to understand human actions. This paper comprehensively surveys …

Crnet: Cross-reference networks for few-shot segmentation

W Liu, C Zhang, G Lin, F Liu - Proceedings of the IEEE/CVF …, 2020 - openaccess.thecvf.com
Over the past few years, state-of-the-art image segmentation algorithms are based on deep
convolutional neural networks. To render a deep network with the ability to understand a …

ICNet: Information conversion network for RGB-D based salient object detection

G Li, Z Liu, H Ling - IEEE Transactions on Image Processing, 2020 - ieeexplore.ieee.org
RGB-D based salient object detection (SOD) methods leverage the depth map as valuable
complementary information for better SOD performance. Previous methods mainly resort to …

A unified transformer framework for group-based segmentation: Co-segmentation, co-saliency detection and video salient object detection

Y Su, J Deng, R Sun, G Lin, H Su… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Humans tend to mine objects by learning from a group of images or several frames of video
since we live in a dynamic world. In the computer vision area, many researchers focus on co …

Dreamscene360: Unconstrained text-to-3d scene generation with panoramic gaussian splatting

S Zhou, Z Fan, D Xu, H Chang, P Chari… - … on Computer Vision, 2024 - Springer
The increasing demand for virtual reality applications has highlighted the significance of
crafting immersive 3D assets. We present a text-to-3D 360° scene generation pipeline that …