A survey of modern deep learning based object detection models
Object Detection is the task of classification and localization of objects in an image or video.
It has gained prominence in recent years due to its widespread applications. This article …
It has gained prominence in recent years due to its widespread applications. This article …
[HTML][HTML] Graph neural networks: A review of methods and applications
Lots of learning tasks require dealing with graph data which contains rich relation
information among elements. Modeling physics systems, learning molecular fingerprints …
information among elements. Modeling physics systems, learning molecular fingerprints …
Yolov9: Learning what you want to learn using programmable gradient information
Today's deep learning methods focus on how to design the objective functions to make the
prediction as close as possible to the target. Meanwhile, an appropriate neural network …
prediction as close as possible to the target. Meanwhile, an appropriate neural network …
YOLOv6: A single-stage object detection framework for industrial applications
For years, the YOLO series has been the de facto industry-level standard for efficient object
detection. The YOLO community has prospered overwhelmingly to enrich its use in a …
detection. The YOLO community has prospered overwhelmingly to enrich its use in a …
A convnet for the 2020s
The" Roaring 20s" of visual recognition began with the introduction of Vision Transformers
(ViTs), which quickly superseded ConvNets as the state-of-the-art image classification …
(ViTs), which quickly superseded ConvNets as the state-of-the-art image classification …
A generalist agent
Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …
towards building a single generalist agent beyond the realm of text outputs. The agent …
Zero-shot text-to-image generation
Text-to-image generation has traditionally focused on finding better modeling assumptions
for training on a fixed dataset. These assumptions might involve complex architectures …
for training on a fixed dataset. These assumptions might involve complex architectures …
nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation
Biomedical imaging is a driver of scientific discovery and a core component of medical care
and is being stimulated by the field of deep learning. While semantic segmentation …
and is being stimulated by the field of deep learning. While semantic segmentation …
Simam: A simple, parameter-free attention module for convolutional neural networks
In this paper, we propose a conceptually simple but very effective attention module for
Convolutional Neural Networks (ConvNets). In contrast to existing channel-wise and spatial …
Convolutional Neural Networks (ConvNets). In contrast to existing channel-wise and spatial …
Mvitv2: Improved multiscale vision transformers for classification and detection
In this paper, we study Multiscale Vision Transformers (MViTv2) as a unified architecture for
image and video classification, as well as object detection. We present an improved version …
image and video classification, as well as object detection. We present an improved version …