Emergent correspondence from image diffusion
Finding correspondences between images is a fundamental problem in computer vision. In
this paper, we show that correspondence emerges in image diffusion models without any …
A tale of two features: Stable Diffusion complements DINO for zero-shot semantic correspondence
Text-to-image diffusion models have made significant advances in generating and editing
high-quality images. As a result, numerous approaches have explored the ability of diffusion …
Towards scalable neural representation for diverse videos
Implicit neural representations (INR) have gained increasing attention in representing 3D
scenes and images, and have been recently applied to encode videos (e.g., NeRV, E-NeRV) …
SD4Match: Learning to prompt Stable Diffusion model for semantic matching
In this paper, we address the challenge of matching semantically similar keypoints across
image pairs. Existing research indicates that the intermediate output of the UNet within the …
Telling left from right: Identifying geometry-aware semantic correspondence
While pre-trained large-scale vision models have shown significant promise for semantic
correspondence, their features often struggle to grasp the geometry and orientation of …
What is Point Supervision Worth in Video Instance Segmentation?
Video instance segmentation (VIS) is a challenging vision task that aims to detect, segment,
and track objects in videos. Conventional VIS methods rely on densely annotated object …
ASIC: Aligning sparse in-the-wild image collections
We present a method for joint alignment of sparse in-the-wild image collections of an object
category. Most prior works assume either ground-truth keypoint annotations or a large …
Improving semantic correspondence with viewpoint-guided spherical maps
Recent self-supervised models produce visual features that are not only effective at
encoding image-level but also pixel-level semantics. They have been reported to obtain …
Efficient Semantic Matching with Hypercolumn Correlation
Recent studies show that leveraging the match-wise relationships within the 4D correlation
map yields significant improvements in establishing semantic correspondences, but at the …
UVIS: Unsupervised Video Instance Segmentation
Video instance segmentation requires classifying, segmenting, and tracking every object
across video frames. Unlike existing approaches that rely on masks, boxes, or category labels …