Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arXiv preprint arXiv:2312.00752, 2023 - arxiv.org
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …
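The papers in this list all build on the same primitive: a discretized linear state space model (SSM) recurrence, x_t = A x_{t-1} + B u_t, y_t = C x_t, which can be computed in time linear in sequence length. A minimal sketch of that recurrence is shown below; the matrices A, B, C and their sizes are illustrative placeholders, not parameters from any of the cited papers (which additionally use structured initializations, and in Mamba's case input-dependent, "selective" parameters).

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Run the discretized linear SSM recurrence over a 1-D input sequence u.

    x_t = A x_{t-1} + B u_t   (state update, O(1) per step -> linear in length)
    y_t = C x_t               (readout)
    """
    x = np.zeros(A.shape[0])
    ys = []
    for u_t in u:
        x = A @ x + B * u_t
        ys.append(C @ x)
    return np.array(ys)

# Toy 2-dimensional state; values chosen only for illustration.
A = np.array([[0.9, 0.0],
              [0.1, 0.8]])   # state transition
B = np.array([1.0, 0.0])     # input projection
C = np.array([0.0, 1.0])     # output projection

y = ssm_scan(A, B, C, np.ones(8))
```

Because the recurrence is linear and time-invariant here, the same computation can also be expressed as a long convolution over the input, which is the parallel-training view exploited by S4 and its successors.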

Efficiently modeling long sequences with structured state spaces

A Gu, K Goel, C Ré - arXiv preprint arXiv:2111.00396, 2021 - arxiv.org
A central goal of sequence modeling is designing a single principled model that can
address sequence data across a range of modalities and tasks, particularly on long-range …

Resurrecting recurrent neural networks for long sequences

A Orvieto, SL Smith, A Gu, A Fernando… - International …, 2023 - proceedings.mlr.press
Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are
hard to optimize and slow to train. Deep state-space models (SSMs) have recently been …

On the parameterization and initialization of diagonal state space models

A Gu, K Goel, A Gupta, C Ré - Advances in Neural …, 2022 - proceedings.neurips.cc
State space models (SSMs) have recently been shown to be very effective as a deep learning
layer, offering a promising alternative to sequence models such as RNNs, CNNs, or Transformers …

Hungry hungry hippos: Towards language modeling with state space models

DY Fu, T Dao, KK Saab, AW Thomas, A Rudra… - arXiv preprint arXiv …, 2022 - arxiv.org
State space models (SSMs) have demonstrated state-of-the-art sequence modeling
performance in some modalities, but underperform attention in language modeling …

Simplified state space layers for sequence modeling

JTH Smith, A Warrington, SW Linderman - arXiv preprint arXiv:2208.04933, 2022 - arxiv.org
Models using structured state space sequence (S4) layers have achieved state-of-the-art
performance on long-range sequence modeling tasks. An S4 layer combines linear state …

S4ND: Modeling images and videos as multidimensional signals with state spaces

E Nguyen, K Goel, A Gu, G Downs… - Advances in neural …, 2022 - proceedings.neurips.cc
Visual data such as images and videos are typically modeled as discretizations of inherently
continuous, multidimensional signals. Existing continuous-signal models attempt to exploit …

PointMamba: A simple state space model for point cloud analysis

D Liang, X Zhou, W Xu, X Zhu, Z Zou, X Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers have become one of the foundational architectures in point cloud analysis
tasks due to their excellent global modeling ability. However, the attention mechanism has …

A survey on vision Mamba: Models, applications and challenges

R Xu, S Yang, Y Wang, B Du, H Chen - arXiv preprint arXiv:2404.18861, 2024 - arxiv.org
Mamba, a recent selective structured state space model, performs excellently on long
sequence modeling tasks. Mamba mitigates the modeling constraints of convolutional …

Simple hardware-efficient long convolutions for sequence modeling

DY Fu, EL Epstein, E Nguyen… - International …, 2023 - proceedings.mlr.press
State space models (SSMs) have high performance on long sequence modeling but require
sophisticated initialization techniques and specialized implementations for high quality and …