LiDAR-LLM: Exploring the potential of large language models for 3D LiDAR understanding
Recently, Large Language Models (LLMs) and Multimodal Large Language Models
(MLLMs) have shown promise in instruction following and 2D image understanding. While …
EdgeSAM: Prompt-in-the-loop distillation for on-device deployment of SAM
This paper presents EdgeSAM, an accelerated variant of the Segment Anything Model
(SAM), optimized for efficient execution on edge devices with minimal compromise in …
Stream Query Denoising for Vectorized HD-Map Construction
This paper introduces the Stream Query Denoising (SQD) strategy, a novel and general
approach for high-definition map (HD-map) construction. SQD is designed to improve the …
SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
Visual grounding is a common vision task that involves grounding descriptive sentences to
the corresponding regions of an image. Most existing methods use independent image-text …
Empowering lightweight detectors: Orientation Distillation via anti-ambiguous spatial transformation for remote sensing images
Knowledge distillation (KD) has been one of the most potential methods to
implement a lightweight detector, which plays a significant role in satellite in-orbit processing …
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
Y Wang, X Li, S Weng, G Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
DETR is a novel end-to-end transformer architecture object detector which significantly
outperforms classic detectors when scaling up. In this paper we focus on the compression of …
Computer vision model compression techniques for embedded systems: A survey
Deep neural networks have consistently represented the state of the art in most computer
vision problems. In these scenarios, larger and more complex models have demonstrated …
D-FINE: redefine regression Task in DETRs as Fine-grained distribution refinement
We introduce D-FINE, a powerful real-time object detector that achieves outstanding
localization precision by redefining the bounding box regression task in DETR models. D …
Distilling Knowledge from Large-Scale Image Models for Object Detection
Large-scale image models have made great progress in recent years, pushing the
boundaries of many vision tasks, e.g., object detection. Considering that deploying large …
DHS-DETR: Efficient DETRs with dynamic head switching
Detection Transformer (DETR) and its variants have emerged a new paradigm to object
detection, but their high computational cost hinders practical applications. By investigating …