Occlusion handling in generic object detection: A review
The significant power of deep learning networks has led to enormous development in object
detection. Over the last few years, object detector frameworks have achieved tremendous …
Learning environment-aware affordance for 3D articulated object manipulation under occlusions
Perceiving and manipulating 3D articulated objects in diverse environments is essential for
home-assistant robots. Recent studies have shown that point-level affordance provides …
A systematic review of object detection from images using deep learning
J Kaur, W Singh - Multimedia Tools and Applications, 2024 - Springer
The development of object detection has led to huge improvements in human interaction
systems. Object detection is a challenging task because it involves many parameters …
Unveiling the potential of structure preserving for weakly supervised object localization
Weakly supervised object localization (WSOL) remains an open problem due to the
deficiency of finding object extent information using a classification network. While prior …
SwapMix: Diagnosing and regularizing the over-reliance on visual context in visual question answering
While Visual Question Answering (VQA) has progressed rapidly, previous works
raise concerns about robustness of current VQA models. In this work, we study the …
UMC: A unified bandwidth-efficient and multi-resolution based collaborative perception framework
Multi-agent collaborative perception (MCP) has recently attracted much attention. It includes
three key processes: communication for sharing, collaboration for integration, and …
Watch or listen: Robust audio-visual speech recognition with visual corruption modeling and reliability scoring
This paper deals with Audio-Visual Speech Recognition (AVSR) under multimodal input
corruption situation where audio inputs and visual inputs are both corrupted, which is not …
Modeling image composition for complex scene generation
We present a method that achieves state-of-the-art results on challenging (few-shot) layout-
to-image generation tasks by accurately modeling textures, structures and relationships …
Compositional convolutional neural networks: A robust and interpretable model for object recognition under occlusion
Computer vision systems in real-world applications need to be robust to partial occlusion
while also being explainable. In this work, we show that black-box deep convolutional …
MSCAF-Net: A general framework for camouflaged object detection via learning multi-scale context-aware features
The aim of camouflaged object detection (COD) is to find objects that are hidden in their
surrounding environment. Due to the factors like low illumination, occlusion, small size and …