Efficient frequency domain-based transformers for high-quality image deblurring

L Kong, J Dong, J Ge, M Li… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We present an effective and efficient method that explores the properties of Transformers in
the frequency domain for high-quality image deblurring. Our method is motivated by the …

Audio–visual segmentation

J Zhou, J Wang, J Zhang, W Sun, J Zhang… - … on Computer Vision, 2022 - Springer
We propose to explore a new problem called audio-visual segmentation (AVS), in which the
goal is to output a pixel-level map of the object(s) that produce sound at the time of the …

HRTransNet: HRFormer-driven two-modality salient object detection

B Tang, Z Liu, Y Tan, Q He - … on Circuits and Systems for Video …, 2022 - ieeexplore.ieee.org
The High-Resolution Transformer (HRFormer) can maintain high-resolution representation
and share global receptive fields. It is friendly towards salient object detection (SOD) in …

CamoFormer: Masked separable attention for camouflaged object detection

B Yin, X Zhang, DP Fan, S Jiao… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
How to identify and segment camouflaged objects from the background is challenging.
Inspired by the multi-head self-attention in Transformers, we present a simple masked …

The treasure beneath multiple annotations: An uncertainty-aware edge detector

C Zhou, Y Huang, M Pu, Q Guan… - Proceedings of the …, 2023 - openaccess.thecvf.com
Deep learning-based edge detectors heavily rely on pixel-wise labels which are often
provided by multiple annotators. Existing methods fuse multiple annotations using a simple …

Multimodal variational auto-encoder based audio-visual segmentation

Y Mao, J Zhang, M Xiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose an Explicit Conditional Multimodal Variational Auto-Encoder
(ECMVAE) for audio-visual segmentation (AVS), aiming to segment sound sources in the …

CATR: Combinatorial-dependence audio-queried transformer for audio-visual video segmentation

K Li, Z Yang, L Chen, Y Yang, J Xiao - Proceedings of the 31st ACM …, 2023 - dl.acm.org
Audio-visual video segmentation (AVVS) aims to generate pixel-level maps of
sound-producing objects within image frames and ensure the maps faithfully adhere to the given …

AVSegFormer: Audio-visual segmentation with transformer

S Gao, Z Chen, G Chen, W Wang, T Lu - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Audio-visual segmentation (AVS) aims to locate and segment the sounding objects in a
given video, which demands audio-driven pixel-level scene understanding. The existing …

Annotation-free audio-visual segmentation

J Liu, Y Wang, C Ju, C Ma… - Proceedings of the …, 2024 - openaccess.thecvf.com
The objective of Audio-Visual Segmentation (AVS) is to localise the sounding
objects within visual scenes by accurately predicting pixel-wise segmentation masks. To …

Toward deeper understanding of camouflaged object detection

Y Lv, J Zhang, Y Dai, A Li, N Barnes… - IEEE transactions on …, 2023 - ieeexplore.ieee.org
Prey in the wild evolve to be camouflaged to avoid being recognized by predators. In this
way, camouflage acts as a key defence mechanism across species that is critical to survival …