Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arXiv preprint arXiv:2312.00752, 2023 - minjiazhang.github.io
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …
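
The selective-scan recurrence behind this entry can be sketched in a few lines. Below is a minimal sequential NumPy version, assuming a diagonal state matrix and illustrative projection names (W_dt, W_B, W_C are not the paper's API); the paper itself fuses this loop into a hardware-aware parallel scan:

```python
import numpy as np

def selective_ssm(x, A, W_dt, W_B, W_C):
    """Sequential sketch of a selective SSM scan in the spirit of Mamba's S6.

    x: (T, D) inputs; A: (D, N) diagonal state matrix (entries < 0);
    W_dt: (D, D) and W_B, W_C: (D, N) make the step size Delta and the
    B, C matrices input-dependent -- the 'selective' part. Names here
    are illustrative, not the paper's API.
    """
    T, D = x.shape
    h = np.zeros((D, A.shape[1]))                  # one state row per channel
    y = np.empty_like(x)
    for t in range(T):
        dt = np.logaddexp(0.0, x[t] @ W_dt)        # softplus step size, (D,)
        B, C = x[t] @ W_B, x[t] @ W_C              # per-step B_t, C_t, (N,)
        A_bar = np.exp(dt[:, None] * A)            # ZOH discretization, (D, N)
        h = A_bar * h + (dt[:, None] * B) * x[t][:, None]
        y[t] = h @ C
    return y
```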

Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality

T Dao, A Gu - arXiv preprint arXiv:2405.21060, 2024 - arxiv.org
While Transformers have been the main architecture behind deep learning's success in
language modeling, state-space models (SSMs) such as Mamba have recently been shown …
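
The duality in the title is easy to demonstrate: a scalar-gated linear recurrence and a masked, attention-like matrix product compute the same output. A hedged sketch (array names are illustrative, not the paper's API):

```python
import numpy as np

def ssd_recurrent(x, a, B, C):
    """Linear form: h_t = a_t h_{t-1} + B_t x_t^T, y_t = C_t h_t.
    x: (T, d); a: (T,) positive per-step decays; B, C: (T, N)."""
    h = np.zeros((B.shape[1], x.shape[1]))
    y = np.empty_like(x)
    for t in range(x.shape[0]):
        h = a[t] * h + np.outer(B[t], x[t])
        y[t] = C[t] @ h
    return y

def ssd_quadratic(x, a, B, C):
    """Dual 'attention' form: y = (L * (C B^T)) x, where the causal mask
    L[i, j] = a[j+1] * ... * a[i] holds cumulative decay products."""
    cs = np.cumsum(np.log(a))
    L = np.tril(np.exp(cs[:, None] - cs[None, :]))
    return ((C @ B.T) * L) @ x
```

On random inputs the two functions agree to floating-point precision; the paper's algorithms exploit exactly this equivalence to mix recurrent and block-matrix computation.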

Simplified state space layers for sequence modeling

JTH Smith, A Warrington, SW Linderman - arXiv preprint arXiv:2208.04933, 2022 - arxiv.org
Models using structured state space sequence (S4) layers have achieved state-of-the-art
performance on long-range sequence modeling tasks. An S4 layer combines linear state …
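
The snippet breaks off at the layer's key ingredient, a linear state space model. Because an S4-style layer is linear and time-invariant, it can be unrolled into a convolution kernel; a minimal sketch, assuming the discretized parameters A_bar, B_bar, C are already given:

```python
import numpy as np

def ssm_kernel(A_bar, B_bar, C, L):
    """Unroll the discretized LTI SSM  x_k = A_bar x_{k-1} + B_bar u_k,
    y_k = C x_k  into its length-L convolution kernel
    K = (C B_bar, C A_bar B_bar, ..., C A_bar^{L-1} B_bar)."""
    K, v = np.empty(L), B_bar
    for l in range(L):
        K[l] = C @ v
        v = A_bar @ v
    return K

def ssm_apply(u, A_bar, B_bar, C):
    """Apply the SSM to a 1-D input as a causal convolution."""
    K = ssm_kernel(A_bar, B_bar, C, len(u))
    return np.convolve(u, K)[: len(u)]
```

S4 computes this kernel cheaply through a structured parameterization and FFTs; the simplified S5 layer of this entry instead runs one multi-input, multi-output SSM directly with a parallel scan.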

State space models for event cameras

N Zubic, M Gehrig… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Today, state-of-the-art deep neural networks that process event-camera data first convert a
temporal window of events into dense grid-like input representations. As such, they exhibit …

The hidden attention of mamba models

A Ali, I Zimerman, L Wolf - arXiv preprint arXiv:2403.01590, 2024 - arxiv.org
The Mamba layer offers an efficient selective state space model (SSM) that is highly effective
in modeling multiple domains, including NLP, long-range sequence processing, and …

The illusion of state in state-space models

W Merrill, J Petty, A Sabharwal - arXiv preprint arXiv:2404.08819, 2024 - arxiv.org
State-space models (SSMs) have emerged as a potential alternative architecture for building
large language models (LLMs) compared to the previously ubiquitous transformer …
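
The "state" at issue can be made concrete with the paper's canonical hard case: composing permutations (the word problem for S_5), which a genuine recurrent state handles in constant memory per step but which, the authors argue, fixed-depth SSMs, like transformers, cannot express, since both lie in TC^0. A small illustration of the task:

```python
import random

def compose(perms):
    """Word problem for S_5: fold a sequence of permutations of {0..4}
    into one. A recurrent state tracks this in O(1) memory per step."""
    state = list(range(5))                 # identity permutation
    for p in perms:
        state = [state[i] for i in p]      # fold the next permutation in
    return state

seq = [random.sample(range(5), 5) for _ in range(1000)]
print(compose(seq))
```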

Convolutional state space models for long-range spatiotemporal modeling

J Smith, S De Mello, J Kautz… - Advances in Neural …, 2023 - proceedings.neurips.cc
Effectively modeling long spatiotemporal sequences is challenging due to the need to model
complex spatial correlations and long-range temporal dependencies simultaneously …
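
One reading of "convolutional state space model" that matches this line of work: keep the SSM recurrence but replace its matrix multiplications with spatial convolutions, so the state retains the input's grid layout. A rough single-channel sketch under that assumption:

```python
import numpy as np
from scipy.signal import convolve2d

def conv_ssm_step(h, u, A_k, B_k):
    """One ConvSSM-style step: the SSM update h_t = A h_{t-1} + B u_t with
    the matrix products replaced by 2-D convolutions, so the state h keeps
    the spatial layout of the frames. h, u: (H, W); A_k, B_k: small kernels
    (illustrative; the paper works with multi-channel tensors)."""
    return (convolve2d(h, A_k, mode="same")
            + convolve2d(u, B_k, mode="same"))
```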

Eagle and Finch: RWKV with matrix-valued states and dynamic recurrence

B Peng, D Goldstein, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2024 - openreview.net
We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving
upon the RWKV (RWKV-4) architecture (Peng et al., 2023). Our architectural design …
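
The matrix-valued state in the title replaces RWKV-4's vector recurrence with a d x d state updated by outer products, much as in linear attention. A simplified sketch that omits Eagle's per-token bonus term and Finch's data-dependent decay:

```python
import numpy as np

def matrix_state_rwkv(r, k, v, w):
    """Sketch of an Eagle/Finch-style matrix-valued-state recurrence:
    S_t = diag(w) S_{t-1} + k_t v_t^T,   y_t = r_t S_t.
    r, k, v: (T, d); w: (d,) decay in (0, 1). Simplified: RWKV-5 also adds
    a per-token 'bonus' term, and RWKV-6 makes w input-dependent."""
    T, d = r.shape
    S = np.zeros((d, d))
    y = np.empty((T, d))
    for t in range(T):
        S = w[:, None] * S + np.outer(k[t], v[t])
        y[t] = r[t] @ S
    return y
```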

GraphChi: Large-scale graph computation on just a PC

A Kyrola, G Blelloch, C Guestrin - 10th USENIX Symposium on Operating …, 2012 - usenix.org
Current systems for graph computation require a distributed computing cluster to handle
very large real-world problems, such as analysis on social networks or the web graph. While …

Structured parallel programming: patterns for efficient computation

M McCool, J Reinders, A Robison - 2012 - books.google.com
Structured Parallel Programming offers the simplest way for developers to learn patterns for
high-performance parallel programming. Written by parallel computing experts and industry …