- Academic Search

Diffusion models: A comprehensive survey of methods and applications

L Yang, Z Zhang, Y Song, S Hong, R Xu, Y Zhao… - ACM Computing …, 2023 - dl.acm.org

Diffusion models have emerged as a powerful new family of deep generative models with
record-breaking performance in many applications, including image synthesis, video …

Cited by 1537

Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org

Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Cited by 133

Segment everything everywhere all at once

X Zou, J Yang, H Zhang, F Li, L Li… - Advances in …, 2024 - proceedings.neurips.cc

In this work, we present SEEM, a promptable and interactive model for segmenting
everything everywhere all at once in an image. In SEEM, we propose a novel and versatile …

Cited by 520

Emergent correspondence from image diffusion

L Tang, M Jia, Q Wang, CP Phoo… - Advances in Neural …, 2023 - proceedings.neurips.cc

Finding correspondences between images is a fundamental problem in computer vision. In
this paper, we show that correspondence emerges in image diffusion models without any …

Cited by 297

Convolutions die hard: Open-vocabulary segmentation with single frozen convolutional clip

Q Yu, J He, X Deng, X Shen… - Advances in Neural …, 2023 - proceedings.neurips.cc

Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing
objects from an open set of categories in diverse environments. One way to address this …

Cited by 126

Segment and Recognize Anything at Any Granularity

F Li, H Zhang, P Sun, X Zou, S Liu, C Li, J Yang… - … on Computer Vision, 2024 - Springer

In this work, we introduce Semantic-SAM, an augmented image segmentation foundation for
segmenting and recognizing anything at desired granularities. Compared to the …

Cited by 164

A simple framework for open-vocabulary segmentation and detection

H Zhang, F Li, X Zou, S Liu, C Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

In this work, we present OpenSeeD, a simple Open-vocabulary Segmentation and Detection
framework that learns from different segmentation and detection datasets. To bridge the gap …

Cited by 159

Recognize anything: A strong image tagging model

Y Zhang, X Huang, J Ma, Z Li, Z Luo… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present the Recognize Anything Model (RAM): a strong foundation model for
image tagging. RAM makes a substantial step for foundation models in computer vision …

Cited by 188

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com

Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …

Cited by 212

Transformer-based visual segmentation: A survey

X Li, H Ding, H Yuan, W Zhang, J Pang… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org

Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …

Cited by 119
