Camoformer: Masked separable attention for camouflaged object detection
Identifying and segmenting camouflaged objects from the background is challenging.
Inspired by the multi-head self-attention in Transformers, we present a simple masked …
Sigma: Siamese Mamba network for multi-modal semantic segmentation
Multi-modal semantic segmentation significantly enhances AI agents' perception and scene
understanding, especially under adverse conditions like low-light or overexposed …
Temo: Towards text-driven 3D stylization for multi-object meshes
Recent progress in the text-driven 3D stylization of a single object has been considerably
promoted by CLIP-based methods. However, the stylization of multi-object 3D scenes is still …
Multimodal feature-guided pre-training for RGB-T perception
Wide-range multiscale object detection for multispectral scene perception from a drone
perspective is challenging. Previous RGB-T perception methods directly use backbone …
Asymformer: Asymmetrical cross-modal representation learning for mobile platform real-time RGB-D semantic segmentation
Understanding indoor scenes is crucial for urban studies. Considering the dynamic nature of
indoor environments, effective semantic segmentation requires both real-time operation and …
Efficient multimodal semantic segmentation via dual-prompt learning
Multimodal (e.g., RGB-Depth/RGB-Thermal) fusion has shown great potential for improving
semantic segmentation in complex scenes (e.g., indoor/low-light conditions). Existing …
CorrMatch: Label propagation via correlation matching for semi-supervised semantic segmentation
This paper presents a simple but performant semi-supervised semantic segmentation
approach called CorrMatch. Previous approaches mostly employ complicated training …
Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection
Despite rapid development, remote sensing object detection remains
challenging for detecting high aspect ratio objects. This paper shows that large strip …
FasterSal: Robust and Real-time Single-Stream Architecture for RGB-D Salient Object Detection
RGB-D Salient Object Detection (SOD) aims to segment the most prominent areas and
objects in a given pair of RGB and depth images. Most current models adopt a dual-stream …
Enhancing representations through heterogeneous self-supervised learning
Incorporating heterogeneous representations from different architectures has facilitated
various vision tasks, e.g., some hybrid networks combine transformers and convolutions …