Rank-DETR for high quality object detection

Y Pu, W Liang, Y Hao, Y Yuan… - Advances in …, 2024 - proceedings.neurips.cc
Modern detection transformers (DETRs) use a set of object queries to predict a list of
bounding boxes, sort them by their classification confidence scores, and select the top …

Grounded SAM: Assembling open-world models for diverse visual tasks

T Ren, S Liu, A Zeng, J Lin, K Li, H Cao, J Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to
combine with the segment anything model (SAM). This integration enables the detection and …

DFA3D: 3D deformable attention for 2D-to-3D feature lifting

H Li, H Zhang, Z Zeng, S Liu, F Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we propose a new operator, called 3D DeFormable Attention (DFA3D), for 2D-
to-3D feature lifting, which transforms multi-view 2D image features into a unified 3D space …

Few-Shot Object Detection with Foundation Models

G Han, SN Lim - Proceedings of the IEEE/CVF Conference …, 2024 - openaccess.thecvf.com
Few-shot object detection (FSOD) aims to detect objects with only a few training examples.
Visual feature extraction and query-support similarity learning are the two critical …

Rethinking detection based table structure recognition for visually rich document images

B Xiao, M Simsek, B Kantarci, AA Alkheir - Expert Systems with Applications, 2025 - Elsevier
Detection models have been extensively employed for the Table Structure Recognition
(TSR) task, aiming to convert table images into structured formats by detecting table …

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer

X Li, T Wang, J Zhao, S Mao, J Wang, F Zheng… - Proceedings of the …, 2024 - dl.acm.org
Emotion recognition aims to discern the emotional state of subjects within an image, relying
on subject-centric and contextual visual cues. Current approaches typically follow a two …

MLP-DINO: category modeling and query graphing with deep MLP for object detection

G Cao, W Huang, X Lan, J Zhang, D Jiang… - Proceedings of the Thirty …, 2024 - ijcai.org
Popular transformer-based detectors detect objects in a one-to-one manner, where both the
bounding box and category of each object are predicted only by the single query, leading to …

A comparison of transformer and CNN-based object detection models for surface defects on Li-Ion Battery Electrodes

A Mattern, H Gerdes, D Grunert, RH Schmitt - Journal of Energy Storage, 2025 - Elsevier
Deep learning-based defect detection approaches offer great potential for end-to-end
surface defect detection. After the prevalent Convolutional Neural Network (CNN) models …

Transformer-based End-to-End Object Detection in Aerial Images

ND Vo, L Nguyen, G Ngo, D Du, L Do… - … Journal of Advanced …, 2023 - search.proquest.com
Transformer models have achieved significant milestones in the field of Artificial Intelligence
in recent years, primarily focusing on text processing and natural language processing …

A lightweight Transformer model for defect detection in electroluminescence images of photovoltaic cells

Y Yang, J Zhang, X Shu, L Pan, M Zhang - IEEE Access, 2024 - ieeexplore.ieee.org
Solar panels play a crucial role in converting solar energy into electricity, with PhotoVoltaic
(PV) modules being their core components. To ensure solar panels function well, efficient …