YOLO-v1 to YOLO-v8, the rise of YOLO and its complementary nature toward digital manufacturing and industrial defect detection

M Hussain - Machines, 2023 - mdpi.com
Since its inception in 2015, the YOLO (You Only Look Once) variant of object detectors has
rapidly grown, with the latest release of YOLO-v8 in January 2023. YOLO variants are …
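
As a usage note alongside this entry: a minimal inference sketch with the ultralytics package, which distributes the YOLO-v8 weights. The weight file name and the image path are illustrative stand-ins, not from the paper.

# A minimal sketch, assuming `pip install ultralytics` and a local image
# "part.jpg" (both hypothetical stand-ins, not from the paper).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")              # pretrained nano variant, fetched on first use
results = model("part.jpg")             # run detection on one image
for box in results[0].boxes:            # iterate detected objects
    print(box.cls, box.conf, box.xyxy)  # class id, confidence, corner coordinates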

A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends

J Gui, T Chen, J Zhang, Q Cao, Z Sun… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …
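
To make the labeled-data point concrete, here is a minimal sketch of one widely used self-supervised objective, a SimCLR-style NT-Xent contrastive loss in PyTorch; the function name and the temperature value are illustrative assumptions, not from the survey.

# A minimal sketch of a SimCLR-style contrastive loss; names and the
# temperature are illustrative assumptions.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (N, D) embeddings of two augmented views of the same N images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit-norm
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # exclude self-similarity
    n = z1.size(0)
    # the positive for view i is the other view of the same image
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)                 # pull views together, push rest apart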

Davit: Dual attention vision transformers

M Ding, B Xiao, N Codella, P Luo, J Wang… - European conference on …, 2022 - Springer
In this work, we introduce Dual Attention Vision Transformers (DaViT), a simple yet effective
vision transformer architecture that is able to capture global context while maintaining …
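
A minimal sketch of the channel-wise half of the dual attention idea: attention computed across channel groups rather than spatial tokens, so the cost grows linearly with token count. The single-head-group shapes and the scaling choice are simplifying assumptions, not the paper's exact formulation.

# A minimal sketch of channel-wise self-attention; shapes and scaling are
# simplifying assumptions, not DaViT's exact formulation.
import torch

def channel_attention(x, heads=4):
    """x: (B, N, C) tokens; attends over channels, so cost is linear in N."""
    B, N, C = x.shape
    # split channels into head groups and transpose: (B, heads, C//heads, N)
    q = k = v = x.transpose(1, 2).reshape(B, heads, C // heads, N)
    attn = (q * (C // heads) ** -0.5) @ k.transpose(-2, -1)  # (B, heads, C/h, C/h)
    attn = attn.softmax(dim=-1)
    out = attn @ v                                           # (B, heads, C/h, N)
    return out.reshape(B, C, N).transpose(1, 2)              # back to (B, N, C)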

Twins: Revisiting the design of spatial attention in vision transformers

X Chu, Z Tian, Y Wang, B Zhang… - Advances in neural …, 2021 - proceedings.neurips.cc
Very recently, a variety of vision transformer architectures for dense prediction tasks have
been proposed and they show that the design of spatial attention is critical to their success in …
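
A minimal sketch of one spatial-attention design in this line of work, Twins-style global sub-sampled attention: keys and values come from a strided-convolution-reduced grid so that global attention stays affordable. The layer sizes, the single-head simplification, and the reduction ratio are illustrative assumptions.

# A minimal sketch of global sub-sampled attention; sizes and the
# single-head simplification are assumptions.
import torch
import torch.nn as nn

class SubsampledAttention(nn.Module):
    def __init__(self, dim, reduction=4):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, 2 * dim)
        # strided conv shrinks the key/value grid by `reduction` per side
        self.sr = nn.Conv2d(dim, dim, kernel_size=reduction, stride=reduction)
        self.scale = dim ** -0.5

    def forward(self, x, H, W):
        """x: (B, H*W, C) tokens on an H x W grid."""
        B, N, C = x.shape
        q = self.q(x)                                        # (B, N, C) full-resolution queries
        x2 = x.transpose(1, 2).reshape(B, C, H, W)           # back to a 2-D map
        x2 = self.sr(x2).flatten(2).transpose(1, 2)          # (B, N/r^2, C) reduced tokens
        k, v = self.kv(x2).chunk(2, dim=-1)
        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        return attn @ v                                      # (B, N, C)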

Localvit: Bringing locality to vision transformers

Y Li, K Zhang, J Cao, R Timofte, L Van Gool - arXiv preprint arXiv …, 2021 - arxiv.org
We study how to introduce locality mechanisms into vision transformers. The transformer
network originates from machine translation and is particularly good at modelling long-range …
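
A minimal sketch of one locality mechanism of the kind LocalViT studies: a depthwise 3x3 convolution inserted between the two linear layers of the feed-forward block, which requires reshaping the token sequence back into its 2-D grid. Dimensions and names are illustrative assumptions.

# A minimal sketch of a feed-forward block with a depthwise convolution
# for local mixing; dimensions are assumptions.
import torch
import torch.nn as nn

class LocalFFN(nn.Module):
    def __init__(self, dim, hidden):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden)
        self.dw = nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden)  # depthwise
        self.fc2 = nn.Linear(hidden, dim)
        self.act = nn.GELU()

    def forward(self, x, H, W):
        """x: (B, H*W, C) patch tokens."""
        B, N, C = x.shape
        x = self.act(self.fc1(x))                    # (B, N, hidden)
        x = x.transpose(1, 2).reshape(B, -1, H, W)   # tokens -> feature map
        x = self.act(self.dw(x))                     # local neighborhood mixing
        x = x.flatten(2).transpose(1, 2)             # map -> tokens
        return self.fc2(x)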

Transformers in vision: A survey

S Khan, M Naseer, M Hayat, SW Zamir… - ACM computing …, 2022 - dl.acm.org
Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …
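
Since this snippet centers on the self-attention mechanism, a minimal sketch of scaled dot-product self-attention may help; this is the single-head, projection-free form, purely for illustration.

# A minimal sketch of scaled dot-product self-attention; single head,
# no learned projections, purely illustrative.
import torch

def self_attention(x):
    """x: (B, N, D) token embeddings; returns a weighted mix of all tokens."""
    d = x.size(-1)
    scores = x @ x.transpose(-2, -1) / d ** 0.5   # (B, N, N) pairwise affinities
    weights = scores.softmax(dim=-1)              # each token distributes attention
    return weights @ x                            # (B, N, D) context-mixed tokens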

Transgan: Two pure transformers can make one strong gan, and that can scale up

Y Jiang, S Chang, Z Wang - Advances in Neural …, 2021 - proceedings.neurips.cc
The recent explosive interest in transformers has suggested their potential to become
powerful "universal" models for computer vision tasks, such as classification, detection, and …
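
A minimal sketch of the convolution-free generator idea behind TransGAN: map a latent vector to a token grid, mix tokens with standard transformer encoder blocks, and project each token to RGB. All sizes here are illustrative assumptions, far smaller than the paper's models.

# A minimal sketch of a convolution-free generator; sizes are illustrative
# assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class TinyTransformerGenerator(nn.Module):
    def __init__(self, latent=128, dim=256, grid=8, layers=4):
        super().__init__()
        self.grid = grid
        self.to_tokens = nn.Linear(latent, grid * grid * dim)
        self.pos = nn.Parameter(torch.zeros(1, grid * grid, dim))
        block = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=layers)
        self.to_rgb = nn.Linear(dim, 3)

    def forward(self, z):
        B = z.size(0)
        x = self.to_tokens(z).view(B, self.grid ** 2, -1) + self.pos
        x = self.blocks(x)                    # global token mixing, no convolutions
        img = self.to_rgb(x)                  # (B, grid*grid, 3)
        return img.transpose(1, 2).reshape(B, 3, self.grid, self.grid).tanh()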

Autoformer: Searching transformers for visual recognition

M Chen, H Peng, J Fu, H Ling - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Recently, pure transformer-based models have shown great potential for vision tasks such
as image classification and detection. However, the design of transformer networks is …
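
A minimal sketch of the kind of discrete search space such NAS methods explore, using plain random search as a stand-in for AutoFormer's evolutionary search with weight entanglement; the choice lists and the evaluation stub are assumptions.

# A minimal sketch of architecture sampling over a discrete search space;
# the choice values and the evaluate() stub are assumptions.
import random

SPACE = {
    "embed_dim": [192, 256, 320],
    "depth": [12, 13, 14],
    "heads": [3, 4, 5],
    "mlp_ratio": [3.0, 3.5, 4.0],
}

def sample_config(space):
    """Pick one value per dimension, yielding a candidate sub-network config."""
    return {name: random.choice(options) for name, options in space.items()}

def random_search(n_trials, evaluate):
    """evaluate(config) -> validation score; returns the best config found."""
    return max((sample_config(SPACE) for _ in range(n_trials)), key=evaluate)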
