Attention mechanisms in computer vision: A survey
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …
this observation, attention mechanisms were introduced into computer vision with the aim of …
Image segmentation using deep learning: A survey
Image segmentation is a key task in computer vision and image processing with important
applications such as scene understanding, medical image analysis, robotic perception …
applications such as scene understanding, medical image analysis, robotic perception …
Convolutions die hard: Open-vocabulary segmentation with single frozen convolutional clip
Open-vocabulary segmentation is a challenging task requiring segmenting and recognizing
objects from an open set of categories in diverse environments. One way to address this …
objects from an open set of categories in diverse environments. One way to address this …
Segnext: Rethinking convolutional attention design for semantic segmentation
We present SegNeXt, a simple convolutional network architecture for semantic
segmentation. Recent transformer-based models have dominated the field of se-mantic …
segmentation. Recent transformer-based models have dominated the field of se-mantic …
Visual attention network
While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …
mechanism has recently taken various computer vision areas by storm. However, the 2D …
PIDNet: A real-time semantic segmentation network inspired by PID controllers
Two-branch network architecture has shown its efficiency and effectiveness in real-time
semantic segmentation tasks. However, direct fusion of high-resolution details and low …
semantic segmentation tasks. However, direct fusion of high-resolution details and low …
Rethinking semantic segmentation: A prototype view
Prevalent semantic segmentation solutions, despite their different network designs (FCN
based or attention based) and mask decoding strategies (parametric softmax based or pixel …
based or attention based) and mask decoding strategies (parametric softmax based or pixel …
Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
segments or groups. This technique has numerous real-world applications, such as …
Denseclip: Language-guided dense prediction with context-aware prompting
Recent progress has shown that large-scale pre-training using contrastive image-text pairs
can be a promising alternative for high-quality visual representation learning from natural …
can be a promising alternative for high-quality visual representation learning from natural …
Deep dual-resolution networks for real-time and accurate semantic segmentation of traffic scenes
Using light-weight architectures or reasoning on low-resolution images, recent methods
realize very fast scene parsing, even running at more than 100 FPS on a single GPU …
realize very fast scene parsing, even running at more than 100 FPS on a single GPU …