SAVi++: Towards end-to-end object-centric learning from real-world videos

G Elsayed, A Mahendran… - Advances in …, 2022 - proceedings.neurips.cc
The visual world can be parsimoniously characterized in terms of distinct entities with sparse
interactions. Discovering this compositional structure in dynamic visual scenes has proven …

ReCo: Retrieve and co-segment for zero-shot transfer

G Shin, W Xie, S Albanie - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Semantic segmentation has a broad range of applications, but its real-world impact has
been significantly limited by the prohibitive annotation costs necessary to enable …

POP-3D: Open-vocabulary 3D occupancy prediction from images

A Vobecky, O Siméoni, D Hurych… - Advances in …, 2023 - proceedings.neurips.cc
We describe an approach to predict open-vocabulary 3D semantic voxel occupancy map
from input 2D images with the objective of enabling 3D grounding, segmentation and …

Dense 2D-3D Indoor Prediction with Sound via Aligned Cross-Modal Distillation

H Yun, J Na, G Kim - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Sound can convey significant information for spatial reasoning in our daily lives. To endow
deep networks with such ability, we address the challenge of dense indoor prediction with …

NamedMask: Distilling segmenters from complementary foundation models

G Shin, W Xie, S Albanie - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
The goal of this work is to segment and name regions of images without access to pixel-level
labels during training. To tackle this task, we construct segmenters by distilling the …

Zero-shot unsupervised transfer instance segmentation

G Shin, S Albanie, W Xie - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Segmentation is a core computer vision competency, with applications spanning a broad
range of scientifically and economically valuable domains. To date, however, the prohibitive …

Unsupervised object localization in the era of self-supervised ViTs: A survey

O Siméoni, É Zablocki, S Gidaris, G Puy… - International Journal of …, 2024 - Springer
The recent enthusiasm for open-world vision systems shows the high interest of the
community to perform perception tasks outside of the closed-vocabulary benchmark setups …

OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks

S Sirko-Galouchenko, A Boulch… - Proceedings of the …, 2024 - openaccess.thecvf.com
We introduce a self-supervised pretraining method called OccFeat for camera-only Bird's-
Eye-View (BEV) segmentation networks. With OccFeat we pretrain a BEV network via …

NamedMask: Distilling segmenters from complementary foundation models

G Shin, W Xie, S Albanie - arXiv preprint arXiv:2209.11228, 2022 - arxiv.org
The goal of this work is to segment and name regions of images without access to pixel-level
labels during training. To tackle this task, we construct segmenters by distilling the …

Semantic segmentation of urban environments: Leveraging U-Net deep learning model for cityscape image analysis

TS Arulananth, PG Kuppusamy, RK Ayyasamy… - PLOS ONE, 2024 - journals.plos.org
Semantic segmentation of cityscapes via deep learning is an essential and game-changing
research topic that offers a more nuanced comprehension of urban landscapes. Deep …