Efficient deep learning: A survey on making deep learning models smaller, faster, and better

G Menghani - ACM Computing Surveys, 2023 - dl.acm.org
Deep learning has revolutionized the fields of computer vision, natural language
understanding, speech recognition, information retrieval, and more. However, with the …

Model compression and hardware acceleration for neural networks: A comprehensive survey

L Deng, G Li, S Han, L Shi, Y Xie - Proceedings of the IEEE, 2020 - ieeexplore.ieee.org
Domain-specific hardware is becoming a promising topic against the backdrop of the improvement slowdown of general-purpose processors due to the foreseeable end of Moore's Law …

Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arXiv preprint arXiv:2312.00752, 2023 - minjiazhang.github.io
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …
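The mechanism the abstract goes on to describe is a selective SSM whose parameters are functions of the input. Below is a minimal NumPy sketch of a sequential selective scan, assuming illustrative shapes and projection names (W_B, W_C, W_dt are hypothetical; the paper's exact parameterization differs and it uses a hardware-aware parallel scan rather than a Python loop):

    import numpy as np

    def selective_ssm(x, A, W_B, W_C, W_dt):
        """Sequential reference scan of a selective SSM (simplified sketch).

        Assumed shapes: x (T, D) input; A (N,) negative diagonal state matrix
        shared across channels; W_B, W_C (D, N) make B_t, C_t input-dependent;
        W_dt (D, D) makes the per-channel step size Delta_t input-dependent.
        """
        T, D = x.shape
        h = np.zeros((D, A.shape[0]))              # one length-N state per channel
        y = np.empty_like(x)
        for t in range(T):
            dt = np.log1p(np.exp(x[t] @ W_dt))     # softplus keeps step sizes positive
            B_t, C_t = x[t] @ W_B, x[t] @ W_C      # input-dependent (selective) B, C
            A_bar = np.exp(dt[:, None] * A[None])  # discretized per-channel decay
            h = A_bar * h + (dt[:, None] * B_t[None]) * x[t][:, None]  # state update
            y[t] = h @ C_t                         # per-channel readout
        return y

Because A_bar and B_t change with the input, the layer can gate what enters and persists in the state, which is the "selection" mechanism the title refers to.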

Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality

T Dao, A Gu - arXiv preprint arXiv:2405.21060, 2024 - arxiv.org
While Transformers have been the main architecture behind deep learning's success in
language modeling, state-space models (SSMs) such as Mamba have recently been shown …
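A condensed sketch of the duality the abstract alludes to (my notation, not the paper's): unrolling a scalar-gated SSM recurrence shows it computes a masked, attention-like mixing of the inputs,

    h_t = a_t h_{t-1} + B_t x_t, \qquad y_t = C_t^{\top} h_t
    \;\Longrightarrow\;
    y_t = \sum_{s \le t} \Big( \prod_{r=s+1}^{t} a_r \Big) C_t^{\top} B_s \, x_s ,

so Y = M X with M_{ts} = (\prod_{r=s+1}^{t} a_r)\, C_t^{\top} B_s, a lower-triangular semiseparable matrix playing the role of an attention score matrix. This correspondence is what lets attention-style algorithms transfer to SSMs and vice versa.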

RWKV: Reinventing RNNs for the Transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …
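A simplified NumPy sketch of the recurrent, linear-attention-style update RWKV builds on; this omits the paper's bonus term for the current token and its numerical-stability handling, and r, k, v, w stand for the receptance, key, value, and learned decay streams:

    import numpy as np

    def rwkv_style_recurrence(r, k, v, w):
        """Decayed weighted average of values, gated by receptance (sketch).

        r, k, v: (T, D) receptance/key/value streams; w: (D,) learned decay.
        """
        T, D = k.shape
        num, den = np.zeros(D), np.zeros(D)   # running weighted sum and weight mass
        y = np.empty((T, D))
        decay = np.exp(-np.exp(w))            # maps w into a (0, 1) decay factor
        for t in range(T):
            num = decay * num + np.exp(k[t]) * v[t]
            den = decay * den + np.exp(k[t])
            y[t] = (1.0 / (1.0 + np.exp(-r[t]))) * num / den  # sigmoid(r) gates output
        return y

The state carried between steps is two (D,)-vectors regardless of sequence length, which is the constant-memory property the paper contrasts with attention's quadratic scaling.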

Simplified state space layers for sequence modeling

JTH Smith, A Warrington, SW Linderman - arXiv preprint arXiv:2208.04933, 2022 - arxiv.org
Models using structured state space sequence (S4) layers have achieved state-of-the-art
performance on long-range sequence modeling tasks. An S4 layer combines linear state …
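For context where the snippet cuts off: the linear state space core that S4-style layers discretize is the standard continuous-time system (textbook form; the papers add specific parameterizations such as HiPPO initialization),

    \dot{x}(t) = A x(t) + B u(t), \qquad y(t) = C x(t) + D u(t) ,

with, under one common choice (zero-order hold) for step size \Delta,

    \bar{A} = e^{\Delta A}, \qquad \bar{B} = (\Delta A)^{-1}\big(e^{\Delta A} - I\big)\, \Delta B, \qquad x_k = \bar{A} x_{k-1} + \bar{B} u_k .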

Combining recurrent, convolutional, and continuous-time models with linear state space layers

A Gu, I Johnson, K Goel, K Saab… - Advances in neural …, 2021 - proceedings.neurips.cc
Recurrent neural networks (RNNs), temporal convolutions, and neural differential equations
(NDEs) are popular families of deep learning models for time-series data, each with unique …

Deep equilibrium models

S Bai, JZ Kolter, V Koltun - Advances in neural information …, 2019 - proceedings.neurips.cc
We present a new approach to modeling sequential data: the deep equilibrium model
(DEQ). Motivated by an observation that the hidden layers of many existing deep sequence …
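The core idea is to define the layer's output directly as the fixed point z* = f(z*, x) of a single transformation. A minimal sketch using naive fixed-point iteration (the paper itself uses quasi-Newton root finding and backpropagates implicitly through the equilibrium rather than through the iterations):

    import numpy as np

    def deq_fixed_point(f, x, z0, tol=1e-6, max_iter=100):
        """Finds z* with z* = f(z*, x) by fixed-point iteration."""
        z = z0
        for _ in range(max_iter):
            z_next = f(z, x)
            if np.linalg.norm(z_next - z) < tol:
                return z_next
            z = z_next
        return z

    # Illustrative contractive layer, so plain iteration converges.
    W = 0.5 * np.eye(4)
    layer = lambda z, x: np.tanh(W @ z + x)
    z_star = deq_fixed_point(layer, x=np.ones(4), z0=np.zeros(4))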

Repeat after me: Transformers are better than state space models at copying

S Jelassi, D Brandfonbrener, SM Kakade… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers are the dominant architecture for sequence modeling, but there is growing
interest in models that use a fixed-size latent state that does not depend on the sequence …

The neural architecture of language: Integrative modeling converges on predictive processing

M Schrimpf, IA Blank, G Tuckute, C Kauf… - Proceedings of the …, 2021 - pnas.org
The neuroscience of perception has recently been revolutionized with an integrative
modeling approach in which computation, brain function, and behavior are linked across …