- Academic Search

X Zhou, Z Lin, X Shan, Y Wang… - Proceedings of the …, 2024 - openaccess.thecvf.com

We present DrivingGaussian, an efficient and effective framework for surrounding dynamic
autonomous driving scenes. For complex scenes with moving objects, we first sequentially …

Cited by: 142

A survey on open-vocabulary detection and segmentation: Past, present, and future

C Zhu, L Chen - IEEE Transactions on Pattern Analysis and …, 2024 - ieeexplore.ieee.org

As the most fundamental scene understanding tasks, object detection and segmentation
have made tremendous progress in the deep learning era. Due to the expensive manual …

Cited by: 24

Brushnet: A plug-and-play image inpainting model with decomposed dual-branch diffusion

X Ju, X Liu, X Wang, Y Bian, Y Shan, Q Xu - European Conference on …, 2024 - Springer

Image inpainting, the process of restoring corrupted images, has seen significant
advancements with the advent of diffusion models (DMs). Despite these advancements …

Cited by: 27

A foundation model for joint segmentation, detection and recognition of biomedical objects across nine modalities

T Zhao, Y Gu, J Yang, N Usuyama, HH Lee, S Kiblawi… - Nature …, 2024 - nature.com

Biomedical image analysis is fundamental for biomedical discovery. Holistic image analysis
comprises interdependent subtasks such as segmentation, detection and recognition, which …

Cited by: 8

Diffusion model-based video editing: A survey

W Sun, RC Tu, J Liao, D Tao - arXiv preprint arXiv:2407.07111, 2024 - arxiv.org

The rapid development of diffusion models (DMs) has significantly advanced image and
video applications, making "what you want is what you see" a reality. Among these, video …

Cited by: 10

Segment anything in medical images and videos: Benchmark and deployment

J Ma, S Kim, F Li, M Baharoon, R Asakereh… - arXiv preprint arXiv …, 2024 - arxiv.org

Recent advances in segmentation foundation models have enabled accurate and efficient
segmentation across a wide range of natural images and videos, but their utility to medical …

Cited by: 34

T2v-compbench: A comprehensive benchmark for compositional text-to-video generation

K Sun, K Huang, X Liu, Y Wu, Z Xu, Z Li… - arXiv preprint arXiv …, 2024 - arxiv.org

Text-to-video (T2V) generation models have advanced significantly, yet their ability to
compose different objects, attributes, actions, and motions into a video remains unexplored …

Cited by: 17

A survey on occupancy perception for autonomous driving: The information fusion perspective

H Xu, J Chen, S Meng, Y Wang, LP Chau - Information Fusion, 2025 - Elsevier

3D occupancy perception technology aims to observe and understand dense 3D
environments for autonomous vehicles. Owing to its comprehensive perception capability …

Cited by: 10

Filo: Zero-shot anomaly detection by fine-grained description and high-quality localization

Z Gu, B Zhu, G Zhu, Y Chen, H Li, M Tang… - Proceedings of the 32nd …, 2024 - dl.acm.org

Zero-shot anomaly detection (ZSAD) methods detect anomalies without prior access to
known normal or abnormal samples within target categories. Existing methods typically rely …

Cited by: 13

Spatialbot: Precise spatial understanding with vision language models

W Cai, I Ponomarenko, J Yuan, X Li, W Yang… - arXiv preprint arXiv …, 2024 - arxiv.org

Vision Language Models (VLMs) have achieved impressive performance in 2D image
understanding, however they are still struggling with spatial understanding which is the …

Cited by: 12

Grounded sam: Assembling open-world models for diverse visual tasks

Drivinggaussian: Composite gaussian splatting for surrounding dynamic autonomous driving scenes
