Mamba-360: Survey of state space models as transformer alternative for long sequence modelling: Methods, applications, and challenges
Sequence modeling is a crucial area across various domains, including Natural Language Processing (NLP), speech recognition, time series forecasting, music generation, and …
A survey on transformer compression
Transformer plays a vital role in the realms of natural language processing (NLP) and computer vision (CV), especially for constructing large language models (LLM) and large …
Mamba: Linear-time sequence modeling with selective state spaces
Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention module …
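As a rough illustration of the linear-time, selective recurrence the title refers to, here is a minimal NumPy sketch. All names and shapes are assumptions for exposition; the paper's actual implementation uses a hardware-aware parallel scan rather than a Python loop.

```python
import numpy as np

def selective_ssm_scan(x, A, B_proj, C_proj, dt_proj):
    """Minimal sketch of a selective SSM recurrence (illustrative only).

    x: (T, D) input sequence; A: (D, N) transition parameters (negative
    for stability); B_proj, C_proj: (D, N) and dt_proj: (D, D) make B, C
    and the step size input-dependent -- the "selective" part.
    """
    T, D = x.shape
    N = A.shape[1]
    h = np.zeros((D, N))                          # hidden state per channel
    y = np.zeros((T, D))
    for t in range(T):                            # linear in T, unlike attention
        dt = np.logaddexp(0.0, x[t] @ dt_proj)    # softplus step size, (D,)
        B = x[t] @ B_proj                         # input-dependent input map, (N,)
        C = x[t] @ C_proj                         # input-dependent output map, (N,)
        A_bar = np.exp(dt[:, None] * A)           # discretized transition, (D, N)
        h = A_bar * h + (dt * x[t])[:, None] * B  # state update
        y[t] = h @ C                              # readout, (D,)
    return y
```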
VMamba: Visual state space model
Designing computationally efficient network architectures remains an ongoing necessity in computer vision. In this paper, we adapt Mamba, a state-space language model, into …
Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality
While Transformers have been the main architecture behind deep learning's success in language modeling, state-space models (SSMs) such as Mamba have recently been shown …
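The duality in the title can be shown in a few lines: the same linear SSM evaluates either as a linear-time recurrence or as an attention-like matrix product under a semiseparable mask. Below is a scalar-channel sketch under assumed names; the paper's SSD algorithm is blocked and multi-headed.

```python
import numpy as np

def ssd_both_forms(B, C, a, x):
    """One linear SSM computed two equivalent ways (illustrative only).
    B, C: (T, N) input/output maps; a: (T,) per-step decays; x: (T,)."""
    T, N = B.shape
    # Recurrent (SSM) form: O(T)
    h, y_rec = np.zeros(N), np.zeros(T)
    for t in range(T):
        h = a[t] * h + B[t] * x[t]
        y_rec[t] = C[t] @ h
    # "Attention" form: y = (L * (C @ B.T)) @ x, where the lower-triangular
    # mask L[t, s] = a[s+1] * ... * a[t] is semiseparable: O(T^2)
    L = np.zeros((T, T))
    for t in range(T):
        for s in range(t + 1):
            L[t, s] = np.prod(a[s + 1:t + 1])
    y_attn = (L * (C @ B.T)) @ x
    assert np.allclose(y_rec, y_attn)   # the two forms agree
    return y_rec
```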
RWKV: Reinventing RNNs for the Transformer era
Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence …
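The recurrence that lets RWKV run as an RNN at inference time can be sketched as below. This mirrors the WKV operator in spirit only: the released code uses a numerically stabilized reformulation, and all names here are illustrative.

```python
import numpy as np

def wkv_recurrent(k, v, w, u):
    """Sketch of an RWKV-style WKV recurrence (illustrative, unstabilized).
    k, v: (T, D) keys/values; w: (D,) positive decay; u: (D,) current-token bonus."""
    T, D = k.shape
    num, den = np.zeros(D), np.zeros(D)   # running weighted sum and normalizer
    out = np.zeros((T, D))
    for t in range(T):
        cur = np.exp(u + k[t])            # current token gets a bonus weight
        out[t] = (num + cur * v[t]) / (den + cur)
        decay = np.exp(-w)                # past contributions decay each step
        num = decay * num + np.exp(k[t]) * v[t]
        den = decay * den + np.exp(k[t])
    return out
```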
Resurrecting recurrent neural networks for long sequences
Recurrent Neural Networks (RNNs) offer fast inference on long sequences but are hard to optimize and slow to train. Deep state-space models (SSMs) have recently been …
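The fix this line of work proposes (e.g., the Linear Recurrent Unit) is, roughly, a linear RNN with a diagonal complex-valued transition whose eigenvalues are parameterized to stay inside the unit circle. A minimal sketch; names and shapes are assumptions.

```python
import numpy as np

def linear_recurrent_unit(x, nu, theta, B, C):
    """Sketch of a diagonal linear recurrence (illustrative only).
    x: (T, D); nu, theta: (N,) with 0 < nu < 1; B: (D, N); C: (N, D)."""
    lam = nu * np.exp(1j * theta)          # stable complex eigenvalues
    h = np.zeros(lam.shape, dtype=complex)
    y = np.zeros_like(x, dtype=float)
    for t in range(x.shape[0]):
        h = lam * h + x[t] @ B             # elementwise decay + input injection
        y[t] = (h @ C).real                # real-valued readout
    return y
```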
Hyena hierarchy: Towards larger convolutional language models
Recent advances in deep learning have relied heavily on the use of large Transformers due to their ability to learn at scale. However, the core building block of Transformers, the …
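The primitive that replaces attention in this line of work is a convolution whose (implicitly parameterized) filter spans the whole sequence; evaluated with FFTs it costs O(T log T) rather than attention's O(T^2). A single-channel sketch with assumed names:

```python
import numpy as np

def fft_long_conv(u, h):
    """Causal long convolution via FFT (illustrative only).
    u: (T,) input channel; h: (T,) filter as long as the sequence."""
    T = len(u)
    n = 2 * T                              # zero-pad to avoid circular wrap-around
    y = np.fft.irfft(np.fft.rfft(u, n) * np.fft.rfft(h, n), n)
    return y[:T]                           # y[t] = sum_{s<=t} h[t-s] * u[s]
```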
Hungry Hungry Hippos: Towards language modeling with state space models
State space models (SSMs) have demonstrated state-of-the-art sequence modeling performance in some modalities, but underperform attention in language modeling …
Long-term forecasting with TiDE: Time-series dense encoder
Recent work has shown that simple linear models can outperform several Transformer-based approaches in long-term time-series forecasting. Motivated by this, we propose a …
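The "simple linear model" baseline this abstract alludes to can be written in a few lines: one least-squares map from the last lookback points to the next horizon points. Illustrative only; TiDE itself is an MLP encoder-decoder, and all names here are assumptions.

```python
import numpy as np

def fit_linear_forecaster(series, lookback, horizon):
    """Fit one linear map from `lookback` past values to `horizon` future values."""
    X = np.stack([series[i:i + lookback]
                  for i in range(len(series) - lookback - horizon + 1)])
    Y = np.stack([series[i + lookback:i + lookback + horizon]
                  for i in range(len(series) - lookback - horizon + 1)])
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)   # (lookback, horizon)
    return W

# usage: forecast = series[-lookback:] @ W
```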