A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS

J Terven, DM Córdova-Esparza… - Machine learning and …, 2023 - mdpi.com
YOLO has become a central real-time object detection system for robotics, driverless cars,
and video monitoring applications. We present a comprehensive analysis of YOLO's …

Object detection using deep learning, CNNs and vision transformers: A review

AB Amjoud, M Amrouch - IEEE Access, 2023 - ieeexplore.ieee.org
Detecting objects remains one of computer vision and image understanding applications'
most fundamental and challenging aspects. Significant advances in object detection have …

DETRs with collaborative hybrid assignments training

Z Zong, G Song, Y Liu - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
In this paper, we provide the observation that too few queries assigned as positive samples
in DETR with one-to-one set matching leads to sparse supervision on the encoder's output …

TOOD: Task-aligned one-stage object detection

C Feng, Y Zhong, Y Gao, MR Scott… - 2021 IEEE/CVF …, 2021 - computer.org
One-stage object detection is commonly implemented by optimizing two sub-tasks: object
classification and localization, using heads with two parallel branches, which might lead to a …

Conditional DETR for fast training convergence

D Meng, X Chen, Z Fan, G Zeng, H Li… - Proceedings of the …, 2021 - openaccess.thecvf.com
The recently-developed DETR approach applies the transformer encoder and decoder
architecture to object detection and achieves promising performance. In this paper, we …

YOLOX: Exceeding YOLO series in 2021

Z Ge, S Liu, F Wang, Z Li, J Sun - arXiv preprint arXiv:2107.08430, 2021 - arxiv.org
In this report, we present some experienced improvements to YOLO series, forming a new
high-performance detector--YOLOX. We switch the YOLO detector to an anchor-free manner …

α-IoU: A Family of Power Intersection over Union Losses for Bounding Box Regression

J He, S Erfani, X Ma, J Bailey… - Advances in neural …, 2021 - proceedings.neurips.cc
Bounding box (bbox) regression is a fundamental task in computer vision. So far, the most
commonly used loss functions for bbox regression are the Intersection over Union (IoU) loss …

Learning spatio-temporal transformer for visual tracking

B Yan, H Peng, J Fu, D Wang… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
In this paper, we present a new tracking architecture with an encoder-decoder transformer
as the key component. The encoder models the global spatio-temporal feature …

Scaled-YOLOv4: Scaling cross stage partial network

CY Wang, A Bochkovskiy… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We show that the YOLOv4 object detection neural network based on the CSP approach,
scales both up and down and is applicable to small and large networks while maintaining …

Deformable DETR: Deformable transformers for end-to-end object detection

X Zhu, W Su, L Lu, B Li, X Wang, J Dai - arXiv preprint arXiv:2010.04159, 2020 - arxiv.org
DETR has been recently proposed to eliminate the need for many hand-designed
components in object detection while demonstrating good performance. However, it suffers …