DrivingGaussian: Composite Gaussian splatting for surrounding dynamic autonomous driving scenes
We present DrivingGaussian, an efficient and effective framework for surrounding dynamic
autonomous driving scenes. For complex scenes with moving objects, we first sequentially …
A survey on open-vocabulary detection and segmentation: Past, present, and future
As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in the deep learning era. Due to the expensive manual …
BrushNet: A plug-and-play image inpainting model with decomposed dual-branch diffusion
Image inpainting, the process of restoring corrupted images, has seen significant
advancements with the advent of diffusion models (DMs). Despite these advancements …
A foundation model for joint segmentation, detection and recognition of biomedical objects across nine modalities
Biomedical image analysis is fundamental for biomedical discovery. Holistic image analysis
comprises interdependent subtasks such as segmentation, detection and recognition, which …
Diffusion model-based video editing: A survey
The rapid development of diffusion models (DMs) has significantly advanced image and
video applications, making "what you want is what you see" a reality. Among these, video …
Segment anything in medical images and videos: Benchmark and deployment
Recent advances in segmentation foundation models have enabled accurate and efficient
segmentation across a wide range of natural images and videos, but their utility to medical …
T2V-CompBench: A comprehensive benchmark for compositional text-to-video generation
Text-to-video (T2V) generation models have advanced significantly, yet their ability to
compose different objects, attributes, actions, and motions into a video remains unexplored …
A survey on occupancy perception for autonomous driving: The information fusion perspective
3D occupancy perception technology aims to observe and understand dense 3D
environments for autonomous vehicles. Owing to its comprehensive perception capability …
FiLo: Zero-shot anomaly detection by fine-grained description and high-quality localization
Zero-shot anomaly detection (ZSAD) methods detect anomalies without prior access to
known normal or abnormal samples within target categories. Existing methods typically rely …
SpatialBot: Precise spatial understanding with vision language models
Vision Language Models (VLMs) have achieved impressive performance in 2D image
understanding; however, they still struggle with spatial understanding, which is the …