Backbones-review: Feature extraction networks for deep learning and deep reinforcement learning approaches

O Elharrouss, Y Akbari, N Almaadeed… - arXiv preprint arXiv …, 2022 - arxiv.org
To understand the real world using various types of data, Artificial Intelligence (AI) is the
most used technique nowadays. While finding the pattern within the analyzed data …

Convolutions die hard: Open-vocabulary segmentation with single frozen convolutional CLIP

Q Yu, J He, X Deng, X Shen… - Advances in Neural …, 2023 - proceedings.neurips.cc
Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing
objects from an open set of categories in diverse environments. One way to address this …

OneFormer: One transformer to rule universal image segmentation

J Jain, J Li, MT Chiu, A Hassani… - Proceedings of the …, 2023 - openaccess.thecvf.com
Universal Image Segmentation is not a new concept. Past attempts to unify image
segmentation include scene parsing, panoptic segmentation, and, more recently, new …

Masked-attention mask transformer for universal image segmentation

B Cheng, I Misra, AG Schwing… - Proceedings of the …, 2022 - openaccess.thecvf.com
Image segmentation groups pixels with different semantics, e.g., category or instance
membership. Each choice of semantics defines a task. While only the semantics of each task …

A generalist framework for panoptic segmentation of images and videos

T Chen, L Li, S Saxena, G Hinton… - Proceedings of the …, 2023 - openaccess.thecvf.com
Panoptic segmentation assigns semantic and instance ID labels to every pixel of an image.
As permutations of instance IDs are also valid solutions, the task requires learning of high …

Panoptic segmentation: A review

O Elharrouss, S Al-Maadeed, N Subramanian… - arXiv preprint arXiv …, 2021 - arxiv.org
Image segmentation for video analysis plays an essential role in different research fields
such as smart city, healthcare, computer vision and geoscience, and remote sensing …

CMT-DeepLab: Clustering mask transformers for panoptic segmentation

Q Yu, H Wang, D Kim, S Qiao… - Proceedings of the …, 2022 - openaccess.thecvf.com
We propose Clustering Mask Transformer (CMT-DeepLab), a transformer-based
framework for panoptic segmentation designed around clustering. It rethinks the existing …

Milestones in autonomous driving and intelligent vehicles—Part II: Perception and planning

L Chen, S Teng, B Li, X Na, Y Li, Z Li… - … on Systems, Man …, 2023 - ieeexplore.ieee.org
A growing interest in autonomous driving (AD) and intelligent vehicles (IVs) is fueled by their
promise for enhanced safety, efficiency, and economic benefits. While previous surveys …

MP-Former: Mask-piloted transformer for image segmentation

H Zhang, F Li, H Xu, S Huang, S Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present a mask-piloted Transformer which improves masked-attention in Mask2Former
for image segmentation. The improvement is based on our observation that Mask2Former …

MOAT: Alternating mobile convolution and attention brings strong vision models

C Yang, S Qiao, Q Yu, X Yuan, Y Zhu… - The Eleventh …, 2022 - openreview.net
This paper presents MOAT, a family of neural networks that build on top of MObile
convolution (i.e., inverted residual blocks) and ATtention. Unlike the current works that stack …