Rank-DETR for high quality object detection
Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …
Grounded SAM: Assembling open-world models for diverse visual tasks
We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to
combine with the segment anything model (SAM). This integration enables the detection and …
DFA3D: 3D deformable attention for 2D-to-3D feature lifting
In this paper, we propose a new operator, called 3D DeFormable Attention (DFA3D), for 2D-
to-3D feature lifting, which transforms multi-view 2D image features into a unified 3D space …
Few-Shot Object Detection with Foundation Models
Few-shot object detection (FSOD) aims to detect objects with only a few training examples.
Visual feature extraction and query-support similarity learning are the two critical …
Rethinking detection based table structure recognition for visually rich document images
Detection models have been extensively employed for the Table Structure Recognition
(TSR) task, aiming to convert table images into structured formats by detecting table …
Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer
Emotion recognition aims to discern the emotional state of subjects within an image, relying
on subject-centric and contextual visual cues. Current approaches typically follow a two …
MLP-DINO: category modeling and query graphing with deep MLP for object detection
Popular transformer-based detectors detect objects in a one-to-one manner, where both the
bounding box and category of each object are predicted only by the single query, leading to …
A comparison of transformer and CNN-based object detection models for surface defects on Li-Ion Battery Electrodes
A Mattern, H Gerdes, D Grunert, RH Schmitt - Journal of Energy Storage, 2025 - Elsevier
Deep learning-based defect detection approaches offer great potential for end-to-end
surface defect detection. After the prevalent Convolutional Neural Network (CNN) models …
Transformer-based End-to-End Object Detection in Aerial Images
Transformer models have achieved significant milestones in the field of Artificial Intelligence
in recent years, primarily focusing on text processing and natural language processing …
A lightweight Transformer model for defect detection in electroluminescence images of photovoltaic cells
Y Yang, J Zhang, X Shu, L Pan, M Zhang - IEEE Access, 2024 - ieeexplore.ieee.org
Solar panels play a crucial role in converting solar energy into electricity, with photovoltaic
(PV) modules being their core components. To ensure solar panels function well, efficient …