Video captioning using global-local representation

L Yan, S Ma, Q Wang, Y Chen, X Zhang… - … on Circuits and …, 2022 - ieeexplore.ieee.org
Video captioning is a challenging task as it needs to accurately transform visual
understanding into natural language description. To date, state-of-the-art methods …

A survey on map-based localization techniques for autonomous vehicles

A Chalvatzaras, I Pratikakis… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Autonomous vehicles integrate complex software stacks for realizing the necessary iterative
perception, planning, and action operations. One of the foundational layers of such stacks is …

Solve the puzzle of instance segmentation in videos: A weakly supervised framework with spatio-temporal collaboration

L Yan, Q Wang, S Ma, J Wang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Instance segmentation in videos, which aims to segment and track multiple objects in video
frames, has garnered a flurry of research attention in recent years. In this paper, we present …

Online Action Detection in Surveillance Scenarios: A Comprehensive Review and Comparative Study of State-of-the-Art Multi-Object Tracking Methods

J Alikhanov, H Kim - IEEE Access, 2023 - ieeexplore.ieee.org
Online action detection in surveillance scenarios presents considerable challenges,
particularly due to the dynamically changing environments and real-time processing …

SiamMDM: an adaptive fusion network with dynamic template for real-time satellite video single object tracking

J Yang, Z Pan, Z Wang, B Lei… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Tracking moving targets in satellite videos has attracted wide attention recently. However,
the development of target tracking in satellite videos is much slower than that in general …

TPRNet: camouflaged object detection via transformer-induced progressive refinement network

Q Zhang, Y Ge, C Zhang, H Bi - The Visual Computer, 2023 - Springer
Camouflaged object detection (COD) is a challenging task which aims to detect objects
similar to the surrounding environment. In this paper, we propose a transformer-induced …

Feature aggregated queries for transformer-based video object detectors

Y Cui - Proceedings of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Video object detection needs to solve feature degradation situations that rarely happen in
the image domain. One solution is to use the temporal information and fuse the features from …

TREP: Transformer-based evidential prediction for pedestrian intention with uncertainty

Z Zhang, R Tian, Z Ding - Proceedings of the AAAI Conference on …, 2023 - ojs.aaai.org
With rapid development in hardware (sensors and processors) and AI algorithms, automated
driving techniques have entered the public's daily life and achieved great success in …

Dynamic feature aggregation for efficient video object detection

Y Cui - Proceedings of the Asian Conference on Computer …, 2022 - openaccess.thecvf.com
Video object detection is a fundamental yet challenging task in computer vision. One
practical solution is to take advantage of temporal information from the video and apply …

Dynamic proposals for efficient object detection

Y Cui, L Yang, D Liu - arXiv preprint arXiv:2207.05252, 2022 - arxiv.org
Object detection is a basic computer vision task to localize and categorize objects in a
given image. Most state-of-the-art detection methods utilize a fixed number of proposals as …