A systematic review of object detection from images using deep learning

J Kaur, W Singh - Multimedia Tools and Applications, 2024 - Springer
The development of object detection has led to huge improvements in human interaction
systems. Object detection is a challenging task because it involves many parameters …

Occlusion handling in generic object detection: A review

K Saleh, S Szénási, Z Vámossy - 2021 IEEE 19th World …, 2021 - ieeexplore.ieee.org
The significant power of deep learning networks has led to enormous development in object
detection. Over the last few years, object detector frameworks have achieved tremendous …

MSCAF-Net: A general framework for camouflaged object detection via learning multi-scale context-aware features

Y Liu, H Li, J Cheng, X Chen - IEEE Transactions on Circuits …, 2023 - ieeexplore.ieee.org
The aim of camouflaged object detection (COD) is to find objects that are hidden in their
surrounding environment. Due to the factors like low illumination, occlusion, small size and …

UMC: A unified bandwidth-efficient and multi-resolution based collaborative perception framework

T Wang, G Chen, K Chen, Z Liu… - Proceedings of the …, 2023 - openaccess.thecvf.com
Multi-agent collaborative perception (MCP) has recently attracted much attention. It includes
three key processes: communication for sharing, collaboration for integration, and …

Watch or listen: Robust audio-visual speech recognition with visual corruption modeling and reliability scoring

J Hong, M Kim, J Choi, YM Ro - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
This paper deals with Audio-Visual Speech Recognition (AVSR) under multimodal input
corruption situation where audio inputs and visual inputs are both corrupted, which is not …

Unveiling the potential of structure preserving for weakly supervised object localization

X Pan, Y Gao, Z Lin, F Tang, W Dong… - Proceedings of the …, 2021 - openaccess.thecvf.com
Weakly supervised object localization (WSOL) remains an open problem due to the
deficiency of finding object extent information using a classification network. While prior …

Cooperative perception with V2V communication for autonomous vehicles

H Ngo, H Fang, H Wang - IEEE Transactions on Vehicular …, 2023 - ieeexplore.ieee.org
Occlusion is a critical problem in the Autonomous Driving System. Solving this problem
requires robust collaboration among autonomous vehicles traveling on the same roads …

SwapMix: Diagnosing and regularizing the over-reliance on visual context in visual question answering

V Gupta, Z Li, A Kortylewski, C Zhang… - Proceedings of the …, 2022 - openaccess.thecvf.com
While Visual Question Answering (VQA) has progressed rapidly, previous works
raise concerns about robustness of current VQA models. In this work, we study the …

Modeling image composition for complex scene generation

Z Yang, D Liu, C Wang, J Yang… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
We present a method that achieves state-of-the-art results on challenging (few-shot) layout-
to-image generation tasks by accurately modeling textures, structures and relationships …

Track initialization and re-identification for 3D multi-view multi-object tracking

L Van Ma, TTD Nguyen, BN Vo, H Jang, M Jeon - Information Fusion, 2024 - Elsevier
We propose a 3D multi-object tracking (MOT) solution using only 2D detections from
monocular cameras, which automatically initiates/terminates tracks as well as resolves track …