Mamba-360: Survey of state space models as transformer alternative for long sequence modelling: Methods, applications, and challenges
Sequence modeling is a crucial area across various domains, including Natural Language
Processing (NLP), speech recognition, time series forecasting, music generation, and …
Mamba: Linear-time sequence modeling with selective state spaces
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …
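For orientation, the selective state-space recurrence behind this paper fits in a few lines. The NumPy sketch below is a minimal reading of it, not the paper's implementation: the projections W_dt, W_B, and W_C are illustrative placeholders, B uses a simplified Euler discretization, and the hardware-aware parallel scan is replaced by a plain Python loop.

import numpy as np

def selective_ssm(x, A, W_dt, W_B, W_C):
    # x: (T, D) inputs; A: (D, N) negative state matrix (one N-dim SSM per channel).
    # W_dt: (D, D) and W_B, W_C: (D, N) make the step size and the B/C matrices
    # functions of the input -- the "selective" part of the title.
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                        # recurrent state: fixed size per step
    y = np.empty((T, D))
    for t in range(T):
        dt = np.logaddexp(0.0, x[t] @ W_dt)     # softplus -> positive step size, (D,)
        B = x[t] @ W_B                          # (N,) input-dependent input matrix
        C = x[t] @ W_C                          # (N,) input-dependent output matrix
        A_bar = np.exp(dt[:, None] * A)         # zero-order-hold discretization of A
        B_bar = dt[:, None] * B[None, :]        # simplified Euler discretization of B
        h = A_bar * h + B_bar * x[t][:, None]   # linear recurrence in t
        y[t] = h @ C                            # per-channel readout
    return y

# Toy usage: a negative A keeps the recurrence stable.
T, D, N = 16, 8, 4
rng = np.random.default_rng(0)
out = selective_ssm(rng.normal(size=(T, D)),
                    -np.exp(rng.normal(size=(D, N))),
                    0.1 * rng.normal(size=(D, D)),
                    0.1 * rng.normal(size=(D, N)),
                    0.1 * rng.normal(size=(D, N)))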
VMamba: Visual state space model
Y Liu, Y Tian, Y Zhao, H Yu, L Xie… - Advances in neural …, 2025 - proceedings.neurips.cc
Designing computationally efficient network architectures remains an ongoing necessity in
computer vision. In this paper, we adapt Mamba, a state-space language model, into …
Resurrecting recurrent neural networks for long sequences
Abstract Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are
hard to optimize and slow to train. Deep state-space models (SSMs) have recently been …
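The "fast inference" half of that trade-off is concrete: at decoding time a recurrent model does a fixed amount of work per token, while attention re-reads a growing cache. A toy comparison (shapes and names are illustrative, not from the paper):

import numpy as np

def rnn_decode_step(h, x, W_h, W_x):
    # O(d^2) work and O(d) state per token, independent of sequence length.
    return np.tanh(W_h @ h + W_x @ x)

def attention_decode_step(q, K, V):
    # O(t * d) work per token at position t: K and V cache every previous token.
    s = q @ K.T
    w = np.exp(s - s.max())
    return (w / w.sum()) @ V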
Simplified state space layers for sequence modeling
Models using structured state space sequence (S4) layers have achieved state-of-the-art
performance on long-range sequence modeling tasks. An S4 layer combines linear state …
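As background for the truncated sentence above: the core of an S4-style layer is a linear ODE, x'(t) = A x + B u with y = C x + D u, discretized and run as a recurrence. A minimal single-input, single-output NumPy sketch using the bilinear discretization from the S4 line of work (HiPPO initialization and the surrounding network are omitted; all names are illustrative):

import numpy as np

def ssm_scan(u, A, B, C, D, step):
    # Bilinear (Tustin) discretization of the continuous-time system,
    # followed by a linear recurrence over the inputs u: (T,).
    N = A.shape[0]
    I = np.eye(N)
    inv = np.linalg.inv(I - (step / 2) * A)
    A_bar = inv @ (I + (step / 2) * A)          # discrete state matrix
    B_bar = inv @ (step * B)                    # discrete input matrix, (N,)
    x = np.zeros(N)
    y = np.empty(len(u))
    for k, u_k in enumerate(u):
        x = A_bar @ x + B_bar * u_k             # state update
        y[k] = C @ x + D * u_k                  # readout
    return y

# Toy usage with a random stable system:
N = 4
rng = np.random.default_rng(0)
A = -np.eye(N) + 0.1 * rng.normal(size=(N, N))
y = ssm_scan(np.sin(np.linspace(0, 6, 50)), A, rng.normal(size=N),
             rng.normal(size=N), 0.0, step=0.1)

Because the recurrence is linear and time-invariant, the same layer can also be evaluated as a long convolution (S4) or a parallel scan (S5), which is what makes training parallelizable.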
Gated linear attention transformers with hardware-efficient training
Transformers with linear attention allow for efficient parallel training but can simultaneously
be formulated as an RNN with 2D (matrix-valued) hidden states, thus enjoying linear-time …
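The equivalence stated in the snippet is easy to make concrete. Below, unnormalized causal linear attention is computed as an RNN whose hidden state is a d x d matrix; the optional gate is a simplified per-key-dimension decay in the spirit of the paper, not its exact parameterization, and feature maps and normalization are omitted:

import numpy as np

def linear_attention_as_rnn(q, k, v, g=None):
    # q, k, v: (T, d); optional gate g: (T, d) with entries in (0, 1).
    T, d = q.shape
    S = np.zeros((d, d))                 # the 2D (matrix-valued) hidden state
    out = np.empty_like(v)
    for t in range(T):
        if g is not None:
            S = g[t][:, None] * S        # data-dependent forgetting of old state
        S = S + np.outer(k[t], v[t])     # rank-1 update from the new key/value
        out[t] = S.T @ q[t]              # equals sum_{s<=t} (q_t . k_s) v_s when ungated
    return out

The sequential form above is what gives linear-time, constant-memory inference; for training, the paper's contribution is a chunk-parallel reformulation that maps well onto matrix-multiply hardware.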
RS-Mamba for large remote sensing image dense prediction
Context modeling is critical for remote sensing image dense prediction tasks. Nowadays, the
growing size of very-high-resolution (VHR) remote sensing images poses challenges in …
Hierarchically gated recurrent neural network for sequence modeling
Z Qin, S Yang, Y Zhong - Advances in Neural Information …, 2023 - proceedings.neurips.cc
Transformers have surpassed RNNs in popularity due to their superior abilities in parallel
training and long-term dependency modeling. Recently, there has been a renewed interest …
Monarch mixer: A simple sub-quadratic gemm-based architecture
D Fu, S Arora, J Grogan, I Johnson… - Advances in …, 2023 - proceedings.neurips.cc
Abstract Machine learning models are increasingly being scaled in both sequence length
and model dimension to reach longer contexts and better performance. However, existing …
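To unpack "sub-quadratic GEMM-based": a Monarch-style operator replaces a dense n x n multiply with two block-diagonal multiplies (batched GEMMs) separated by a fixed permutation, costing O(n^1.5) instead of O(n^2). A toy sketch under that reading; the exact Monarch parameterization and the full M2 architecture differ in detail:

import numpy as np

def monarch_multiply(x, L, R):
    # x: (n,) with n = m * m; L, R: (m, m, m) stacks of m dense (m, m) blocks.
    # Cost: 2 * m * m^2 = 2 * n**1.5 multiply-adds, vs n**2 for a dense matrix.
    m = L.shape[0]
    X = x.reshape(m, m)
    X = np.einsum('bij,bj->bi', L, X)    # first block-diagonal (batched) GEMM
    X = X.T                              # fixed permutation: transpose the grid
    X = np.einsum('bij,bj->bi', R, X)    # second block-diagonal (batched) GEMM
    return X.reshape(-1)

# Toy usage: n = 16 with m = 4 blocks of size 4.
m = 4
rng = np.random.default_rng(0)
y = monarch_multiply(rng.normal(size=(m * m,)),
                     rng.normal(size=(m, m, m)),
                     rng.normal(size=(m, m, m)))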
A survey on efficient inference for large language models
Large Language Models (LLMs) have attracted extensive attention due to their remarkable
performance across various tasks. However, the substantial computational and memory …