Occlusion handling in generic object detection: A review

K Saleh, S Szénási, Z Vámossy - 2021 IEEE 19th World …, 2021 - ieeexplore.ieee.org
The significant power of deep learning networks has led to enormous development in object
detection. Over the last few years, object detector frameworks have achieved tremendous …

Learning environment-aware affordance for 3D articulated object manipulation under occlusions

R Wu, K Cheng, Y Zhao, C Ning… - Advances in Neural …, 2024 - proceedings.neurips.cc
Perceiving and manipulating 3D articulated objects in diverse environments is essential for
home-assistant robots. Recent studies have shown that point-level affordance provides …

A systematic review of object detection from images using deep learning

J Kaur, W Singh - Multimedia Tools and Applications, 2024 - Springer
The development of object detection has led to huge improvements in human interaction
systems. Object detection is a challenging task because it involves many parameters …

Unveiling the potential of structure preserving for weakly supervised object localization

X Pan, Y Gao, Z Lin, F Tang, W Dong… - Proceedings of the …, 2021 - openaccess.thecvf.com
Weakly supervised object localization (WSOL) remains an open problem due to the
deficiency of finding object extent information using a classification network. While prior …

SwapMix: Diagnosing and regularizing the over-reliance on visual context in visual question answering

V Gupta, Z Li, A Kortylewski, C Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
While Visual Question Answering (VQA) has progressed rapidly, previous works
raise concerns about robustness of current VQA models. In this work, we study the …

UMC: A unified bandwidth-efficient and multi-resolution based collaborative perception framework

T Wang, G Chen, K Chen, Z Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Multi-agent collaborative perception (MCP) has recently attracted much attention. It includes
three key processes: communication for sharing, collaboration for integration, and …

Watch or listen: Robust audio-visual speech recognition with visual corruption modeling and reliability scoring

J Hong, M Kim, J Choi, YM Ro - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
This paper deals with Audio-Visual Speech Recognition (AVSR) under multimodal input
corruption situation where audio inputs and visual inputs are both corrupted, which is not …

Modeling image composition for complex scene generation

Z Yang, D Liu, C Wang, J Yang… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
We present a method that achieves state-of-the-art results on challenging (few-shot)
layout-to-image generation tasks by accurately modeling textures, structures and relationships …

Compositional convolutional neural networks: A robust and interpretable model for object recognition under occlusion

A Kortylewski, Q Liu, A Wang, Y Sun… - International Journal of …, 2021 - Springer
Computer vision systems in real-world applications need to be robust to partial occlusion
while also being explainable. In this work, we show that black-box deep convolutional …

MSCAF-Net: A general framework for camouflaged object detection via learning multi-scale context-aware features

Y Liu, H Li, J Cheng, X Chen - IEEE Transactions on Circuits …, 2023 - ieeexplore.ieee.org
The aim of camouflaged object detection (COD) is to find objects that are hidden in their
surrounding environment. Due to the factors like low illumination, occlusion, small size and …