Advances in medical image analysis with vision transformers: a comprehensive review
The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …
Efficient acceleration of deep learning inference on resource-constrained edge devices: A review
Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted
in breakthroughs in many areas. However, deploying these highly accurate models for data …
EfficientViT: Memory efficient vision transformer with cascaded group attention
Vision transformers have shown great success due to their high model capabilities.
However, their remarkable performance is accompanied by heavy computation costs, which …
Run, don't walk: chasing higher FLOPS for faster neural networks
To design fast neural networks, many works have been focusing on reducing the number of
floating-point operations (FLOPs). We observe that such reduction in FLOPs, however, does …
Designing network design strategies through gradient path analysis
Designing a high-efficiency and high-quality expressive network architecture has always
been the most important research topic in the field of deep learning. Most of today's network …
Rethinking vision transformers for MobileNet size and speed
With the success of Vision Transformers (ViTs) in computer vision tasks, recent arts try to
optimize the performance and complexity of ViTs to enable efficient deployment on mobile …
Lite-Mono: A lightweight CNN and transformer architecture for self-supervised monocular depth estimation
Self-supervised monocular depth estimation that does not require ground truth for training
has attracted attention in recent years. It is of high interest to design lightweight but effective …
Scale-aware modulation meet transformer
This paper presents a new vision Transformer, Scale Aware Modulation Transformer (SMT),
that can handle various downstream tasks efficiently by combining the convolutional network …
Transformer-based visual segmentation: A survey
Visual segmentation seeks to partition images, video frames, or point clouds into multiple
segments or groups. This technique has numerous real-world applications, such as …
SwiftFormer: Efficient additive attention for transformer-based real-time mobile vision applications
Self-attention has become a de facto choice for capturing global context in various vision
applications. However, its quadratic computational complexity with respect to image …