Single-model and any-modality for video object tracking

Z Wu, J Zheng, X Ren, FA Vasluianu… - Proceedings of the …, 2024 - openaccess.thecvf.com
In the realm of video object tracking, auxiliary modalities such as depth, thermal, or event data
have emerged as valuable assets to complement RGB trackers. In practice, most existing …

Blinkvision: A benchmark for optical flow, scene flow and point tracking estimation using RGB frames and events

Y Li, Y Shen, Z Huang, S Chen, W Bian, X Shi… - … on Computer Vision, 2024 - Springer
Recent advances in event-based vision suggest that event cameras complement traditional cameras
by providing continuous observation without frame-rate limitations and high dynamic range …

Muvo: A multimodal generative world model for autonomous driving with geometric representations

D Bogdoll, Y Yang, JM Zöllner - arXiv preprint arXiv:2311.11762, 2023 - arxiv.org
Learning unsupervised world models for autonomous driving has the potential to improve
the reasoning capabilities of today's systems dramatically. However, most work neglects the …

Segment Any Event Streams via Weighted Adaptation of Pivotal Tokens

Z Chen, Z Zhu, Y Zhang, J Hou… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we delve into the nuanced challenge of tailoring the Segment Anything Models
(SAMs) for integration with event data, with the overarching objective of attaining robust and …

Efficient Meshflow and Optical Flow Estimation from Event Cameras

X Luo, A Luo, Z Wang, C Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we explore the problem of event-based meshflow estimation, a novel task that
involves predicting a spatially smooth, sparse motion field from event cameras. To start, we …

Temporal event stereo via joint learning with stereoscopic flow

H Cho, JY Kang, KJ Yoon - European Conference on Computer Vision, 2024 - Springer
Event cameras are dynamic vision sensors inspired by the biological retina, characterized
by their high dynamic range, high temporal resolution, and low power consumption. These …

Bring Event into RGB and LiDAR: Hierarchical Visual-Motion Fusion for Scene Flow

H Zhou, Y Chang, Z Shi - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Single RGB or LiDAR is the mainstream sensor for the challenging scene flow task, which relies
heavily on visual features to match motion features. Compared with a single modality, existing …

High-Performance Grape Disease Detection Method Using Multimodal Data and Parallel Activation Functions

R Li, J Liu, B Shi, H Zhao, Y Li, X Zheng, C Peng, C Lv - Plants, 2024 - pmc.ncbi.nlm.nih.gov
This paper introduces a novel deep learning model for grape disease detection that
integrates multimodal data and parallel heterogeneous activation functions, significantly …

Steering Prediction via a Multi-Sensor System for Autonomous Racing

Z Zhou, Z Wu, F Bolli, R Boutteau, F Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Autonomous racing has rapidly gained research attention. Traditionally, racing cars rely on
2D LiDAR as their primary visual system. In this work, we explore the integration of an event …

Video Frame Prediction from a Single Image and Events

J Zhu, Z Wan, Y Dai - Proceedings of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Recently, the task of Video Frame Prediction (VFP), which predicts future video frames from
previous ones through extrapolation, has made remarkable progress. However, the …