Bridging the gap: A unified video comprehension framework for moment retrieval and highlight detection
Video Moment Retrieval (MR) and Highlight Detection (HD) have attracted
significant attention due to the growing demand for video analysis. Recent approaches treat …
Mind the interference: Retaining pre-trained knowledge in parameter efficient continual learning of vision-language models
This study addresses the Domain-Class Incremental Learning problem, a realistic but
challenging continual learning scenario where both the domain distribution and target …
Intelligent electronic components waste detection in complex occlusion environments based on the focusing dynamic channel-you only look once model
The exponential increase in electronic waste has become a major worldwide issue, driven
by the rapid technological advances and the proliferation of the consumer electronics …
UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment
Image Quality Assessment (IQA) and Image Aesthetic Assessment (IAA) aim to simulate
human subjective perception of image visual quality and aesthetic appeal. Existing methods …
GrootVL: Tree Topology is All You Need in State Space Model
The state space models, employing recursively propagated features, demonstrate strong
representation capabilities comparable to Transformer models and superior efficiency …
RPEE-HEADS: A Novel Benchmark for Pedestrian Head Detection in Crowd Videos
The automatic detection of pedestrian heads in crowded environments is essential for crowd
analysis and management tasks, particularly in high-risk settings such as railway platforms …
Video Object Segmentation with Dynamic Query Modulation
Storing intermediate frame segmentations as memory for long-range context modeling,
spatial-temporal memory-based methods have recently showcased impressive results in …
UniTracker: transformer-based CrossUnihead for multi-object tracking
F Wu, Y Zhang - Journal of Real-Time Image Processing, 2024 - Springer
In recent years, tracking-by-detection (TBD) has emerged as the predominant approach for
Multi-object Tracking (MOT). Most TBD algorithms typically employ separate branch heads …
BMDCNet: A Satellite Imagery Road Extraction Algorithm based on Multi-level Road Feature
C Wang, J Lu, Z Chen - IEEE Geoscience and Remote Sensing …, 2024 - ieeexplore.ieee.org
Multilevel road feature extraction from remote sensing images plays an important role in
numerous applications such as autonomous driving and urban planning. However …
IAFI-FCOS: Intra-and across-layer feature interaction FCOS model for lesion detection of CT images
Q Guan, M Pan, F Chen, Z Yang, Z Yu… - … Joint Conference on …, 2024 - ieeexplore.ieee.org
Effective lesion detection in medical images not only relies on the features of the lesion region,
but is also closely related to the surrounding information. However, most current methods have …