Srformer: Permuted self-attention for single image super-resolution
Previous works have shown that increasing the window size for Transformer-based image
super-resolution models (eg, SwinIR) can significantly improve the model performance but …
super-resolution models (eg, SwinIR) can significantly improve the model performance but …
Visual semantic segmentation based on few/zero-shot learning: An overview
Visual semantic segmentation aims at separating a visual sample into diverse blocks with
specific semantic attributes and identifying the category for each block, and it plays a crucial …
specific semantic attributes and identifying the category for each block, and it plays a crucial …
Full-duplex strategy for video object segmentation
Appearance and motion are two important sources of information in video object
segmentation (VOS). Previous methods mainly focus on using simplex solutions, lowering …
segmentation (VOS). Previous methods mainly focus on using simplex solutions, lowering …
Siamese network for RGB-D salient object detection and beyond
Existing RGB-D salient object detection (SOD) models usually treat RGB and depth as
independent information and design separate networks for feature extraction from each …
independent information and design separate networks for feature extraction from each …
Video transformers: A survey
Transformer models have shown great success handling long-range interactions, making
them a promising tool for modeling video. However, they lack inductive biases and scale …
them a promising tool for modeling video. However, they lack inductive biases and scale …
A survey on deep learning technique for video segmentation
Video segmentation—partitioning video frames into multiple segments or objects—plays a
critical role in a broad range of practical applications, from enhancing visual effects in movie …
critical role in a broad range of practical applications, from enhancing visual effects in movie …
A comprehensive survey on video saliency detection with auditory information: the audio-visual consistency perceptual is the key!
Video saliency detection (VSD) aims at fast locating the most attractive
objects/things/patterns in a given video clip. Existing VSD-related works have mainly relied …
objects/things/patterns in a given video clip. Existing VSD-related works have mainly relied …
Dynamic context-sensitive filtering network for video salient object detection
The ability to capture inter-frame dynamics has been critical to the development of video
salient object detection (VSOD). While many works have achieved great success in this field …
salient object detection (VSOD). While many works have achieved great success in this field …
Camoformer: Masked separable attention for camouflaged object detection
How to identify and segment camouflaged objects from the background is challenging.
Inspired by the multi-head self-attention in Transformers, we present a simple masked …
Inspired by the multi-head self-attention in Transformers, we present a simple masked …
Video polyp segmentation: A deep learning perspective
We present the first comprehensive video polyp segmentation (VPS) study in the deep
learning era. Over the years, developments in VPS are not moving forward with ease due to …
learning era. Over the years, developments in VPS are not moving forward with ease due to …