A survey on video diffusion models
The recent wave of AI-generated content (AIGC) has witnessed substantial success in
computer vision, with the diffusion model playing a crucial role in this achievement. Due to …
Review of image classification algorithms based on convolutional neural networks
L Chen, S Li, Q Bai, J Yang, S Jiang, Y Miao - Remote Sensing, 2021 - mdpi.com
Image classification has always been a hot research direction in the world, and the
emergence of deep learning has promoted the development of this field. Convolutional …
Vision mamba: Efficient visual representation learning with bidirectional state space model
Recently the state space models (SSMs) with efficient hardware-aware designs, i.e., the
Mamba deep learning model, have shown great potential for long sequence modeling …
Segment anything model for medical image analysis: an experimental study
Training segmentation models for medical images continues to be challenging due to the
limited availability of data annotations. Segment Anything Model (SAM) is a foundation …
Segnext: Rethinking convolutional attention design for semantic segmentation
We present SegNeXt, a simple convolutional network architecture for semantic
segmentation. Recent transformer-based models have dominated the field of semantic …
Bevfusion: Multi-task multi-sensor fusion with unified bird's-eye view representation
Multi-sensor fusion is essential for an accurate and reliable autonomous driving system.
Recent approaches are based on point-level fusion: augmenting the LiDAR point cloud with …
Scaling up your kernels to 31x31: Revisiting large kernel design in cnns
We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by
recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few …
Visual attention network
While originally designed for natural language processing tasks, the self-attention
mechanism has recently taken various computer vision areas by storm. However, the 2D …
An effective CNN and Transformer complementary network for medical image segmentation
F Yuan, Z Zhang, Z Fang - Pattern Recognition, 2023 - Elsevier
The Transformer network was originally proposed for natural language processing. Due to
its powerful representation ability for long-range dependency, it has been extended for …
PIDNet: A real-time semantic segmentation network inspired by PID controllers
Two-branch network architecture has shown its efficiency and effectiveness in real-time
semantic segmentation tasks. However, direct fusion of high-resolution details and low …