Transforming medical imaging with Transformers? A comparative review of key properties, current progresses, and future perspectives

J Li, J Chen, Y Tang, C Wang, BA Landman… - Medical Image …, 2023 - Elsevier
Transformer, one of the latest technological advances in deep learning, has gained
prevalence in natural language processing and computer vision. Since medical imaging bears …

Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arXiv preprint arXiv:2312.00752, 2023 - arxiv.org
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …
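
For orientation, the linear-time claim comes from replacing attention with a recurrent state space scan whose parameters depend on the input (the "selective" part). A minimal one-channel NumPy sketch under those assumptions; every name here is illustrative, not Mamba's actual implementation:

    import numpy as np

    def selective_ssm_scan(x, a, w_b, w_c):
        """Toy selective state space scan:
            h_t = a * h_{t-1} + b_t * x_t,   y_t = c_t . h_t
        where b_t and c_t are functions of the current input x_t,
        letting the model choose per step what to store or read out.

        x   : (T,) scalar input sequence
        a   : (n,) diagonal state transition (|a| < 1 for stability)
        w_b : (n,) projection giving b_t = w_b * x_t
        w_c : (n,) projection giving c_t = w_c * x_t
        """
        h = np.zeros(a.shape[0])
        ys = []
        for x_t in x:                  # one pass: linear in length T
            b_t = w_b * x_t            # input-dependent input matrix
            c_t = w_c * x_t            # input-dependent readout
            h = a * h + b_t * x_t      # state update
            ys.append(c_t @ h)         # output
        return np.array(ys)

    # usage: 6 steps, 4 state dimensions
    y = selective_ssm_scan(np.random.randn(6), np.full(4, 0.9),
                           np.random.randn(4), np.random.randn(4))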

A survey of techniques for optimizing transformer inference

KT Chitty-Venkata, S Mittal, M Emani… - Journal of Systems …, 2023 - Elsevier
Recent years have seen a phenomenal rise in the performance and applications of
transformer neural networks. The family of transformer networks, including Bidirectional …
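
One representative optimization such surveys cover is key-value caching for autoregressive decoding: keys and values for the prefix are stored rather than recomputed at every step. A minimal framework-agnostic sketch (the function and its signature are illustrative, not any particular library's API):

    import numpy as np

    def decode_step(x_t, W_q, W_k, W_v, kv_cache):
        """One attention step with a KV cache. Only the new token's
        key/value are computed and appended; prior steps are reused.
        x_t: (d,) token embedding; W_*: (d, d); kv_cache: [keys, values]."""
        q = x_t @ W_q
        kv_cache[0].append(x_t @ W_k)          # append new key
        kv_cache[1].append(x_t @ W_v)          # append new value
        K = np.stack(kv_cache[0])              # (t, d) cached keys
        V = np.stack(kv_cache[1])              # (t, d) cached values
        s = K @ q / np.sqrt(len(q))            # (t,) scores
        w = np.exp(s - s.max()); w /= w.sum()  # softmax over the prefix
        return w @ V                           # (d,) attended output

    # usage: decode 3 tokens of width 4
    d = 4
    W = [np.random.randn(d, d) for _ in range(3)]
    cache = [[], []]
    for x_t in np.random.randn(3, d):
        out = decode_step(x_t, *W, cache)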

Flatten transformer: Vision transformer using focused linear attention

D Han, X Pan, Y Han, S Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
The quadratic computational complexity of self-attention has been a persistent challenge
when applying Transformer models to vision tasks. Linear attention, on the other hand, offers …
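
To make the complexity gap concrete: softmax attention materializes an N×N score matrix (O(N²d)), while kernelized linear attention reorders the same product to avoid it (O(Nd²)). A schematic comparison using a generic non-negative feature map, not the paper's focused variant:

    import numpy as np

    def softmax_attention(Q, K, V):
        """Standard attention: forms an (N, N) matrix -> O(N^2 d)."""
        S = Q @ K.T / np.sqrt(Q.shape[1])
        P = np.exp(S - S.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)
        return P @ V

    def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0.0) + 1e-6):
        """Kernel trick: compute phi(Q) (phi(K)^T V) instead of
        (phi(Q) phi(K)^T) V, never forming the N x N matrix -> O(N d^2)."""
        Qf, Kf = phi(Q), phi(K)
        KV = Kf.T @ V                   # (d, d) summary of keys/values
        Z = Qf @ Kf.sum(axis=0)         # (N,) normalizer
        return (Qf @ KV) / Z[:, None]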

Structure-aware transformer for graph representation learning

D Chen, L O'Bray, K Borgwardt - … Conference on Machine …, 2022 - proceedings.mlr.press
The Transformer architecture has gained growing attention in graph representation learning
recently, as it naturally overcomes several limitations of graph neural networks (GNNs) by …
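
As a rough picture of the mechanism: queries and keys are computed from a representation of each node's local subgraph rather than from the raw node feature, so attention weights reflect structure. A toy sketch with 1-hop mean aggregation standing in for the paper's subgraph encoder (all names illustrative):

    import numpy as np

    def structure_aware_attention(X, adj, W_q, W_k, W_v):
        """Toy structure-aware self-attention over graph nodes.
        X: (N, d) node features; adj: (N, N) 0/1 adjacency; W_*: (d, d).
        Queries/keys come from a neighborhood summary, so two nodes
        attend based on local structure, not just their own features."""
        deg = adj.sum(axis=1, keepdims=True) + 1.0
        sub = (X + adj @ X) / deg          # node + mean of its neighbors
        Q, K, V = sub @ W_q, sub @ W_k, X @ W_v
        S = Q @ K.T / np.sqrt(Q.shape[1])
        P = np.exp(S - S.max(axis=1, keepdims=True))
        P /= P.sum(axis=1, keepdims=True)
        return P @ V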

Spike-driven transformer

M Yao, J Hu, Z Zhou, L Yuan, Y Tian… - Advances in Neural …, 2024 - proceedings.neurips.cc
Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option
due to their unique spike-based, event-driven (i.e., spike-driven) paradigm. In this paper, we …
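
The energy argument rests on activations being binary spikes: a dot product between 0/1 vectors needs no multiplications, only conditional additions. A toy illustration of that degenerate case (not the paper's exact attention operator):

    import numpy as np

    def spike_driven_scores(Q_spikes, K_spikes):
        """With binary (0/1) spike tensors, Q @ K.T reduces to counting
        positions where both neurons fired -- addition only, no multiplies.
        Q_spikes, K_spikes: (N, d) arrays of 0s and 1s."""
        N = Q_spikes.shape[0]
        scores = np.zeros((N, N), dtype=int)
        for i in range(N):
            mask = Q_spikes[i].astype(bool)            # where query i spiked
            scores[i] = K_spikes[:, mask].sum(axis=1)  # masked additions
        return scores                                  # == Q_spikes @ K_spikes.T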

Spikformer: When spiking neural network meets transformer

Z Zhou, Y Zhu, C He, Y Wang, S Yan, Y Tian… - arXiv preprint arXiv …, 2022 - arxiv.org
We consider two biologically plausible structures, the Spiking Neural Network (SNN) and the
self-attention mechanism. The former offers an energy-efficient and event-driven paradigm …
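
For reference, the event-driven half of that pairing: a leaky integrate-and-fire (LIF) neuron turns real-valued currents into the binary spike trains such a model feeds to self-attention. A minimal sketch; the constants are illustrative, not the paper's settings:

    import numpy as np

    def lif_neuron(currents, tau=2.0, v_threshold=1.0, v_reset=0.0):
        """Leaky integrate-and-fire: the membrane potential leaks toward
        rest, integrates input current, and emits a binary spike (then
        resets) whenever it crosses the threshold.
        currents: (T,) input per time step -> (T,) 0/1 spike train."""
        v = v_reset
        spikes = np.zeros_like(currents)
        for t, i_t in enumerate(currents):
            v = v + (i_t - (v - v_reset)) / tau   # leaky integration
            if v >= v_threshold:
                spikes[t] = 1.0                   # fire...
                v = v_reset                       # ...and reset
        return spikes

    print(lif_neuron(np.array([2.0, 2.0, 0.1, 2.5, 0.0])))  # [1. 1. 0. 1. 0.]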

Hierarchically gated recurrent neural network for sequence modeling

Z Qin, S Yang, Y Zhong - Advances in Neural Information …, 2024 - proceedings.neurips.cc
Transformers have surpassed RNNs in popularity due to their superior abilities in parallel
training and long-term dependency modeling. Recently, there has been a renewed interest …
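
The renewed interest is largely in element-wise linear recurrences, which keep an RNN's state yet admit parallel training because the recurrence is linear in h. A minimal data-gated recurrence of the kind HGRN builds on (its forget-gate lower bounds and layer hierarchy are omitted here):

    import numpy as np

    def gated_linear_recurrence(x, W_f, W_i):
        """h_t = f_t * h_{t-1} + (1 - f_t) * i_t, all element-wise.
        x: (T, d); W_f, W_i: (d, d). Gates depend only on the input,
        not on h_{t-1} through a nonlinearity, which is what makes the
        recurrence parallelizable with a scan at training time."""
        sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
        h = np.zeros(x.shape[1])
        out = np.empty_like(x)
        for t in range(x.shape[0]):
            f_t = sigmoid(x[t] @ W_f)     # element-wise forget gate
            i_t = np.tanh(x[t] @ W_i)     # candidate input
            h = f_t * h + (1.0 - f_t) * i_t
            out[t] = h
        return out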

Mb-taylorformer: Multi-branch efficient transformer expanded by taylor formula for image dehazing

Y Qiu, K Zhang, C Wang, W Luo… - Proceedings of the …, 2023 - openaccess.thecvf.com
In recent years, Transformer networks have begun to replace pure convolutional neural
networks (CNNs) in the field of computer vision due to their global receptive field and …
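
The "Taylor formula" of the title refers to linearizing attention: expanding exp(q·k) ≈ 1 + q·k to first order removes the softmax's pairwise coupling, so the product can be reordered into linear complexity. A schematic first-order version, without the paper's multi-branch design or higher-order corrections:

    import numpy as np

    def taylor_linear_attention(Q, K, V):
        """First-order Taylor attention: exp(q.k) ~= 1 + q.k.
        The numerator becomes sum_j (1 + q.k_j) v_j = sum_j v_j + q @ (K^T V),
        i.e. O(N d^2) instead of O(N^2 d). Q, K, V: (N, d); rows of Q, K
        are normalized so 1 + q.k stays non-negative."""
        N = Q.shape[0]
        Qn = Q / np.linalg.norm(Q, axis=1, keepdims=True)
        Kn = K / np.linalg.norm(K, axis=1, keepdims=True)
        num = V.sum(axis=0) + Qn @ (Kn.T @ V)   # (N, d)
        den = N + Qn @ Kn.sum(axis=0)           # (N,) normalizer
        return num / den[:, None]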

Long range language modeling via gated state spaces

H Mehta, A Gupta, A Cutkosky, B Neyshabur - arXiv preprint arXiv …, 2022 - arxiv.org
State space models have been shown to be effective at modeling long-range dependencies,
especially on sequence classification tasks. In this work, we focus on autoregressive …
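
As a rough picture of the "gated" part: the state space layer's output is modulated by a multiplicative gate computed from the input, in the spirit of gated attention units. A simplified per-channel sketch; the dimension-reduction tricks and the actual state space kernel of the paper are omitted:

    import numpy as np

    def gated_state_space(x, a, W_gate):
        """Toy gated SSM block: out_t = gate(x_t) * ssm(x)_t.
        x: (T, d); a: scalar decay in (0, 1) for a one-state-per-channel
        SSM; W_gate: (d, d). The recurrence is cheap and linear in T;
        the sigmoid gate decides per channel how much to let through."""
        h = np.zeros(x.shape[1])
        out = np.empty_like(x)
        for t in range(x.shape[0]):
            h = a * h + (1.0 - a) * x[t]                     # SSM state
            gate = 1.0 / (1.0 + np.exp(-(x[t] @ W_gate)))    # input gate
            out[t] = gate * h
        return out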