Video captioning using global-local representation
Video captioning is a challenging task as it needs to accurately transform visual
understanding into natural language description. To date, state-of-the-art methods …
A survey on map-based localization techniques for autonomous vehicles
Autonomous vehicles integrate complex software stacks for realizing the necessary iterative
perception, planning, and action operations. One of the foundational layers of such stacks is …
Solve the puzzle of instance segmentation in videos: A weakly supervised framework with spatio-temporal collaboration
Instance segmentation in videos, which aims to segment and track multiple objects in video
frames, has garnered a flurry of research attention in recent years. In this paper, we present …
Online Action Detection in Surveillance Scenarios: A Comprehensive Review and Comparative Study of State-of-the-Art Multi-Object Tracking Methods
J Alikhanov, H Kim - IEEE Access, 2023 - ieeexplore.ieee.org
Online action detection in surveillance scenarios presents considerable challenges,
particularly due to the dynamically changing environments and real-time processing …
SiamMDM: an adaptive fusion network with dynamic template for real-time satellite video single object tracking
J Yang, Z Pan, Z Wang, B Lei… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Tracking moving targets in satellite videos has attracted wide attention recently. However,
the development of target tracking in satellite videos is much slower than that in general …
TPRNet: camouflaged object detection via transformer-induced progressive refinement network
Q Zhang, Y Ge, C Zhang, H Bi - The Visual Computer, 2023 - Springer
Camouflaged object detection (COD) is a challenging task which aims to detect objects
similar to the surrounding environment. In this paper, we propose a transformer-induced …
Feature aggregated queries for transformer-based video object detectors
Y Cui - Proceedings of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Video object detection needs to solve feature degradation situations that rarely happen in
the image domain. One solution is to use the temporal information and fuse the features from …
Trep: Transformer-based evidential prediction for pedestrian intention with uncertainty
With rapid development in hardware (sensors and processors) and AI algorithms, automated
driving techniques have entered the public's daily life and achieved great success in …
Dynamic feature aggregation for efficient video object detection
Y Cui - Proceedings of the Asian Conference on Computer …, 2022 - openaccess.thecvf.com
Video object detection is a fundamental yet challenging task in computer vision. One
practical solution is to take advantage of temporal information from the video and apply …
Dynamic proposals for efficient object detection
Object detection is a basic computer vision task to localize and categorize objects in a
given image. Most state-of-the-art detection methods utilize a fixed number of proposals as …