Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org
Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Segment anything in high quality

L Ke, M Ye, M Danelljan, YW Tai… - Advances in Neural …, 2024 - proceedings.neurips.cc
Abstract The recent Segment Anything Model (SAM) represents a big leap in scaling up
segmentation models, allowing for powerful zero-shot capabilities and flexible prompting …

Foundation models in robotics: Applications, challenges, and the future

R Firoozi, J Tucker, S Tian… - … Journal of Robotics …, 2023 - journals.sagepub.com
We survey applications of pretrained foundation models in robotics. Traditional deep
learning models in robotics are trained on small datasets tailored for specific tasks, which …

Gaussian grouping: Segment and edit anything in 3D scenes

M Ye, M Danelljan, F Yu, L Ke - European Conference on Computer …, 2024 - Springer
Abstract The recent Gaussian Splatting achieves high-quality and real-time novel-view
synthesis of the 3D scenes. However, it is solely concentrated on the appearance and …

SAM-CLIP: Merging vision foundation models towards semantic and spatial understanding

H Wang, PKA Vasu, F Faghri… - Proceedings of the …, 2024 - openaccess.thecvf.com
The landscape of publicly available vision foundation models (VFMs) such as CLIP and
SAM is expanding rapidly. VFMs are endowed with distinct capabilities stemming from their …

RepViT: Revisiting mobile CNN from ViT perspective

A Wang, H Chen, Z Lin, J Han… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Abstract Recently lightweight Vision Transformers (ViTs) demonstrate superior performance
and lower latency compared with lightweight Convolutional Neural Networks (CNNs) on …

Tracking anything with decoupled video segmentation

HK Cheng, SW Oh, B Price… - Proceedings of the …, 2023 - openaccess.thecvf.com
Training data for video segmentation are expensive to annotate. This impedes extensions of
end-to-end algorithms to new video segmentation tasks, especially in large-vocabulary …

General in-hand object rotation with vision and touch

H Qi, B Yi, S Suresh, M Lambeta, Y Ma… - … on Robot Learning, 2023 - proceedings.mlr.press
We introduce RotateIt, a system that enables fingertip-based object rotation along multiple
axes by leveraging multimodal sensory inputs. Our system is trained in simulation, where it …

EfficientSAM: Leveraged masked image pretraining for efficient segment anything

Y Xiong, B Varadarajan, L Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Abstract Segment Anything Model (SAM) has emerged as a powerful tool for numerous
vision applications. A key component that drives the impressive performance for zero-shot …

Weakly-supervised semantic segmentation with image-level labels: from traditional models to foundation models

Z Chen, Q Sun - ACM Computing Surveys, 2023 - dl.acm.org
The rapid development of deep learning has driven significant progress in image semantic
segmentation—a fundamental task in computer vision. Semantic segmentation algorithms …