A survey on visual Mamba

H Zhang, Y Zhu, D Wang, L Zhang, T Chen, Z Wang… - Applied Sciences, 2024 - mdpi.com
State space models (SSMs) with selection mechanisms and hardware-aware architectures, namely Mamba, have recently shown significant potential in long-sequence modeling. Since …

Mamba-360: Survey of state space models as Transformer alternative for long sequence modelling: Methods, applications, and challenges

BN Patro, VS Agneeswaran - arXiv preprint arXiv:2404.16112, 2024 - arxiv.org
Sequence modeling is a crucial area across various domains, including Natural Language
Processing (NLP), speech recognition, time series forecasting, music generation, and …

Is Mamba effective for time series forecasting?

Z Wang, F Kong, S Feng, M Wang, X Yang, H Zhao… - Neurocomputing, 2025 - Elsevier
In the realm of time series forecasting (TSF), models must adeptly discern and distill hidden patterns within historical time series data to forecast future states …

A survey of Mamba

H Qu, L Ning, R An, W Fan, T Derr, H Liu, X Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
As one of the most representative DL techniques, the Transformer architecture has empowered
numerous advanced models, especially the large language models (LLMs) that comprise …

The hidden attention of Mamba models

A Ali, I Zimerman, L Wolf - arXiv preprint arXiv:2403.01590, 2024 - arxiv.org
The Mamba layer offers an efficient selective state space model (SSM) that is highly effective
in modeling multiple domains, including NLP, long-range sequence processing, and …

PointRWKV: Efficient RWKV-like model for hierarchical point cloud learning

Q He, J Zhang, J Peng, H He, X Li, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have revolutionized point cloud learning, but their quadratic complexity
hinders extension to long sequences and places a burden on limited computational …

A survey on efficient inference for large language models

Z Zhou, X Ning, K Hong, T Fu, J Xu, S Li, Y Lou… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …

State space model for new-generation network alternative to Transformers: A survey

X Wang, S Wang, Y Ding, Y Li, W Wu, Y Rong… - arXiv preprint arXiv …, 2024 - arxiv.org
In the post-deep learning era, the Transformer architecture has demonstrated powerful
performance across large pre-trained models and various downstream tasks. However, the …

Zamba: A compact 7B SSM hybrid model

P Glorioso, Q Anthony, Y Tokpanov… - arXiv preprint arXiv …, 2024 - arxiv.org
In this technical report, we present Zamba, a novel 7B SSM-transformer hybrid model which
achieves competitive performance against leading open-weight models at a comparable …

Inference optimization of foundation models on AI accelerators

Y Park, K Budhathoki, L Chen, JM Kübler… - Proceedings of the 30th …, 2024 - dl.acm.org
Powerful foundation models, including large language models (LLMs) built on Transformer
architectures, have ushered in a new era of Generative AI across various industries. Industry …