[HTML][HTML] A comprehensive review of yolo architectures in computer vision: From yolov1 to yolov8 and yolo-nas

J Terven, DM Córdova-Esparza… - Machine Learning and …, 2023 - mdpi.com
YOLO has become a central real-time object detection system for robotics, driverless cars,
and video monitoring applications. We present a comprehensive analysis of YOLO's …

Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025 - ieeexplore.ieee.org
Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Grounding dino: Marrying dino with grounded pre-training for open-set object detection

S Liu, Z Zeng, T Ren, F Li, H Zhang, J Yang… - … on Computer Vision, 2024 - Springer
In this paper, we develop an open-set object detector, called Grounding DINO, by marrying
Transformer-based detector DINO with grounded pre-training, which can detect arbitrary …

Yolov9: Learning what you want to learn using programmable gradient information

CY Wang, IH Yeh, HY Mark Liao - European conference on computer …, 2024 - Springer
Today's deep learning methods focus on how to design the objective functions to make the
prediction as close as possible to the target. Meanwhile, an appropriate neural network …

Planning-oriented autonomous driving

Y Hu, J Yang, L Chen, K Li, C Sima… - Proceedings of the …, 2023 - openaccess.thecvf.com
Modern autonomous driving system is characterized as modular tasks in sequential order,
ie, perception, prediction, and planning. In order to perform a wide diversity of tasks and …

Diffusiondet: Diffusion model for object detection

S Chen, P Sun, Y Song, P Luo - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose DiffusionDet, a new framework that formulates object detection as a denoising
diffusion process from noisy boxes to object boxes. During the training stage, object boxes …

Seqtrack: Sequence to sequence learning for visual object tracking

X Chen, H Peng, D Wang, H Lu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In this paper, we present a new sequence-to-sequence learning framework for visual
tracking, dubbed SeqTrack. It casts visual tracking as a sequence generation problem …

YOLOv6: A single-stage object detection framework for industrial applications

C Li, L Li, H Jiang, K Weng, Y Geng, L Li, Z Ke… - arxiv preprint arxiv …, 2022 - arxiv.org
For years, the YOLO series has been the de facto industry-level standard for efficient object
detection. The YOLO community has prospered overwhelmingly to enrich its use in a …

Universal instance perception as object discovery and retrieval

B Yan, Y Jiang, J Wu, D Wang, P Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …

YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

CY Wang, A Bochkovskiy… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Real-time object detection is one of the most important research topics in computer vision.
As new approaches regarding architecture optimization and training optimization are …