A survey on visual Mamba

H Zhang, Y Zhu, D Wang, L Zhang, T Chen, Z Wang… - Applied Sciences, 2024 - mdpi.com
State space models (SSMs) with selection mechanisms and hardware-aware architectures, namely Mamba, have recently shown significant potential in long-sequence modeling. Since …

Mamba-360: Survey of state space models as Transformer alternative for long sequence modelling: Methods, applications, and challenges

BN Patro, VS Agneeswaran - arXiv preprint arXiv:2404.16112, 2024 - arxiv.org
Sequence modeling is a crucial area across various domains, including Natural Language
Processing (NLP), speech recognition, time series forecasting, music generation, and …

Is Mamba effective for time series forecasting?

Z Wang, F Kong, S Feng, M Wang, X Yang, H Zhao… - Neurocomputing, 2025 - Elsevier
In the realm of time series forecasting (TSF), models must adeptly discern and distill hidden patterns within historical time series data to forecast future states …

A survey of Mamba

H Qu, L Ning, R An, W Fan, T Derr, H Liu, X Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
As one of the most representative DL techniques, the Transformer architecture has empowered
numerous advanced models, especially the large language models (LLMs) that comprise …

The hidden attention of Mamba models

A Ali, I Zimerman, L Wolf - arXiv preprint arXiv:2403.01590, 2024 - arxiv.org
The Mamba layer offers an efficient selective state space model (SSM) that is highly effective
in modeling multiple domains, including NLP, long-range sequence processing, and …

PointRWKV: Efficient RWKV-like model for hierarchical point cloud learning

Q He, J Zhang, J Peng, H He, X Li, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have revolutionized point cloud learning, but their quadratic complexity
hinders extension to long sequences and places a burden on limited computational …

A survey on efficient inference for large language models

Z Zhou, X Ning, K Hong, T Fu, J Xu, S Li, Y Lou… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …

State space model for new-generation network alternative to Transformers: A survey

X Wang, S Wang, Y Ding, Y Li, W Wu, Y Rong… - arXiv preprint arXiv …, 2024 - arxiv.org
In the post-deep learning era, the Transformer architecture has demonstrated powerful
performance across large pre-trained models and various downstream tasks. However, the …

Zamba: A compact 7B SSM hybrid model

P Glorioso, Q Anthony, Y Tokpanov… - arXiv preprint arXiv …, 2024 - arxiv.org
In this technical report, we present Zamba, a novel 7B SSM-transformer hybrid model which
achieves competitive performance against leading open-weight models at a comparable …

Inference optimization of foundation models on AI accelerators

Y Park, K Budhathoki, L Chen, JM Kübler… - Proceedings of the 30th …, 2024 - dl.acm.org
Powerful foundation models, including large language models (LLMs) built on Transformer
architectures, have ushered in a new era of Generative AI across various industries. Industry …