Rank-DETR for high quality object detection

Y Pu, W Liang, Y Hao, Y Yuan… - Advances in …, 2024 - proceedings.neurips.cc
Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …

BAM-DETR: Boundary-aligned moment detection transformer for temporal sentence grounding in videos

P Lee, H Byun - European Conference on Computer Vision, 2024 - Springer
Temporal sentence grounding aims to localize moments relevant to a language description.
Recently, DETR-like approaches achieved notable progress by predicting the center and …

Exploring plain ViT reconstruction for multi-class unsupervised anomaly detection

J Zhang, X Chen, Y Wang, C Wang, Y Liu, X Li… - arXiv preprint arXiv …, 2023 - arxiv.org
This work studies the recently proposed challenging and practical Multi-class Unsupervised
Anomaly Detection (MUAD) task, which only requires normal images for training while …

Hybrid Proposal Refiner: Revisiting DETR Series from the Faster R-CNN Perspective

J Zhao, F Wei, C Xu - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
With the transformative impact of the Transformer, DETR pioneered the application of the
encoder-decoder architecture to object detection. A collection of follow-up research, e.g. …

AugDETR: Improving Multi-scale Learning for Detection Transformer

J Dong, Y Lin, C Li, S Zhou, N Zheng - European Conference on Computer …, 2024 - Springer
Current end-to-end detectors typically exploit transformers to detect objects and show
promising performance. Among them, Deformable DETR is a representative paradigm that …

A Graph-Based Approach for Category-Agnostic Pose Estimation

O Hirschorn, S Avidan - European Conference on Computer Vision, 2024 - Springer
Traditional 2D pose estimation models are limited by their category-specific design, making
them suitable only for predefined object categories. This restriction becomes particularly …

Vision transformer off-the-shelf: A surprising baseline for few-shot class-agnostic counting

Z Wang, L Xiao, Z Cao, H Lu - Proceedings of the AAAI Conference on …, 2024 - ojs.aaai.org
Class-agnostic counting (CAC) aims to count objects of interest from a query image given
few exemplars. This task is typically addressed by extracting the features of query image and …

QR-DETR: Query Routing for Detection Transformer

T Senthivel, NS Vu - … of the Asian Conference on Computer …, 2024 - openaccess.thecvf.com
Detection Transformer (DETR) predicts object bounding boxes and classes from learned
object queries. However, DETR exhibits three major flaws: (1) Only a subset of object queries …

PoIFusion: Multi-Modal 3D Object Detection via Fusion at Points of Interest

J Deng, S Zhang, F Dayoub, W Ouyang… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present PoIFusion, a simple yet effective multi-modal 3D object detection
framework to fuse the information of RGB images and LiDAR point clouds at the point of …

LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection

Q Chen, X Su, X Zhang, J Wang, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
In this paper, we present a light-weight detection transformer, LW-DETR, which outperforms
YOLOs for real-time object detection. The architecture is a simple stack of a ViT encoder, a …