CamoFormer: Masked separable attention for camouflaged object detection

B Yin, X Zhang, DP Fan, S Jiao… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
How to identify and segment camouflaged objects from the background is challenging.
Inspired by the multi-head self-attention in Transformers, we present a simple masked …

Sigma: Siamese mamba network for multi-modal semantic segmentation

Z Wan, P Zhang, Y Wang, S Yong, S Stepputtis… - arXiv preprint arXiv …, 2024 - arxiv.org
Multi-modal semantic segmentation significantly enhances AI agents' perception and scene
understanding, especially under adverse conditions like low-light or overexposed …

TeMO: Towards text-driven 3D stylization for multi-object meshes

X Zhang, BW Yin, Y Chen, Z Lin, Y Li… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent progress in the text-driven 3D stylization of a single object has been considerably
promoted by CLIP-based methods. However, the stylization of multi-object 3D scenes is still …

Multimodal feature-guided pre-training for RGB-T perception

J Ouyang, P **, Q Wang - IEEE Journal of Selected Topics in …, 2024 - ieeexplore.ieee.org
Wide-range multiscale object detection for multispectral scene perception from a drone
perspective is challenging. Previous RGB-T perception methods directly use backbone …

AsymFormer: Asymmetrical cross-modal representation learning for mobile platform real-time RGB-D semantic segmentation

S Du, W Wang, R Guo, R Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Understanding indoor scenes is crucial for urban studies. Considering the dynamic nature of
indoor environments, effective semantic segmentation requires both real-time operation and …

Efficient multimodal semantic segmentation via dual-prompt learning

S Dong, Y Feng, Q Yang, Y Huang… - 2024 IEEE/RSJ …, 2024 - ieeexplore.ieee.org
Multimodal (e.g., RGB-Depth/RGB-Thermal) fusion has shown great potential for improving
semantic segmentation in complex scenes (e.g., indoor/low-light conditions). Existing …

CorrMatch: Label propagation via correlation matching for semi-supervised semantic segmentation

B Sun, Y Yang, L Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
This paper presents a simple but performant semi-supervised semantic segmentation
approach called CorrMatch. Previous approaches mostly employ complicated training …

Strip R-CNN: Large Strip Convolution for Remote Sensing Object Detection

X Yuan, ZH Zheng, Y Li, X Liu, L Liu, X Li… - arXiv preprint arXiv …, 2025 - arxiv.org
While witnessed with rapid development, remote sensing object detection remains
challenging for detecting high aspect ratio objects. This paper shows that large strip …

FasterSal: Robust and Real-time Single-Stream Architecture for RGB-D Salient Object Detection

J Zhang, R Zhang, L Xu, X Lu, Y Yu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
RGB-D Salient Object Detection (SOD) aims to segment the most prominent areas and
objects in a given pair of RGB and depth images. Most current models adopt a dual-stream …

Enhancing representations through heterogeneous self-supervised learning

ZY Li, BW Yin, Y Liu, L Liu, MM Cheng - arXiv preprint arXiv:2310.05108, 2023 - arxiv.org
Incorporating heterogeneous representations from different architectures has facilitated
various vision tasks, e.g., some hybrid networks combine transformers and convolutions …