- Academic Search

A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS

J Terven, DM Córdova-Esparza… - Machine Learning and …, 2023 - mdpi.com

YOLO has become a central real-time object detection system for robotics, driverless cars,
and video monitoring applications. We present a comprehensive analysis of YOLO's …

Cited by 1895

Theoretical understanding of convolutional neural network: Concepts, architectures, applications, future directions

MM Taye - Computation, 2023 - mdpi.com

Convolutional neural networks (CNNs) are one of the main types of neural networks used for
image recognition and classification. CNNs have several uses, some of which are object …

Cited by 338

YOLOv9: Learning what you want to learn using programmable gradient information

CY Wang, IH Yeh, HY Mark Liao - European conference on computer …, 2024 - Springer

Today's deep learning methods focus on how to design the objective functions to make the
prediction as close as possible to the target. Meanwhile, an appropriate neural network …

Cited by 1365

Open-vocabulary panoptic segmentation with text-to-image diffusion models

J Xu, S Liu, A Vahdat, W Byeon… - Proceedings of the …, 2023 - openaccess.thecvf.com

We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …

Cited by 424

SAM 2: Segment anything in images and videos

N Ravi, V Gabeur, YT Hu, R Hu, C Ryali, T Ma… - arXiv preprint arXiv …, 2024 - arxiv.org

We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …

Cited by 385

ConvNeXt V2: Co-designing and scaling ConvNets with masked autoencoders

S Woo, S Debnath, R Hu, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Driven by improved architectures and better representation learning frameworks, the field of
visual recognition has enjoyed rapid modernization and performance boost in the early …

Cited by 677

DiffusionDet: Diffusion model for object detection

S Chen, P Sun, Y Song, P Luo - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

We propose DiffusionDet, a new framework that formulates object detection as a denoising
diffusion process from noisy boxes to object boxes. During the training stage, object boxes …

Cited by 485

YOLOv6: A single-stage object detection framework for industrial applications

C Li, L Li, H Jiang, K Weng, Y Geng, L Li, Z Ke… - arXiv preprint arXiv …, 2022 - arxiv.org

For years, the YOLO series has been the de facto industry-level standard for efficient object
detection. The YOLO community has prospered overwhelmingly to enrich its use in a …

Cited by 2595

VoxFormer: Sparse voxel transformer for camera-based 3D semantic scene completion

Y Li, Z Yu, C Choy, C Xiao, JM Alvarez… - Proceedings of the …, 2023 - openaccess.thecvf.com

Humans can easily imagine the complete 3D geometry of occluded objects and scenes. This
appealing ability is vital for recognition and understanding. To enable such capability in AI …

Cited by 218

Universal instance perception as object discovery and retrieval

B Yan, Y Jiang, J Wu, D Wang, P Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com

All instance perception tasks aim at finding certain objects specified by some queries such
as category names, language expressions, and target annotations, but this complete field …

Cited by 165

Feature pyramid networks for object detection
