Hyena Hierarchy: Towards larger convolutional language models

M Poli, S Massaroli, E Nguyen, DY Fu… - International …, 2023 - proceedings.mlr.press
Recent advances in deep learning have relied heavily on the use of large Transformers due
to their ability to learn at scale. However, the core building block of Transformers, the …

FlashAttention: Fast and memory-efficient exact attention with IO-awareness

T Dao, D Fu, S Ermon, A Rudra… - Advances in Neural …, 2022 - proceedings.neurips.cc
Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …
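The snippet's point is that exact self-attention scales quadratically with sequence length. The minimal NumPy sketch below makes that cost concrete: it materializes the full N×N score matrix, which is precisely the memory traffic an IO-aware, tiled kernel avoids. This is a baseline illustration, not the FlashAttention algorithm itself.

```python
import numpy as np

def naive_attention(Q, K, V):
    """Standard exact attention: materializes the full (N, N) score matrix,
    which is the quadratic memory cost a tiled, IO-aware kernel avoids."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (N, N): quadratic in sequence length
    scores -= scores.max(axis=-1, keepdims=True)    # stabilize softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                              # (N, d)

# Toy usage: N = 1024 tokens, head dim 64 -> the score matrix alone has ~1M entries.
N, d = 1024, 64
rng = np.random.default_rng(0)
Q, K, V = rng.standard_normal((3, N, d))
print(naive_attention(Q, K, V).shape)  # (1024, 64)
```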

HiPPO: Recurrent memory with optimal polynomial projections

A Gu, T Dao, S Ermon, A Rudra… - Advances in Neural …, 2020 - proceedings.neurips.cc
A central problem in learning from sequential data is representing cumulative history in an
incremental fashion as more data is processed. We introduce a general framework (HiPPO) …
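The framework's core move is to keep a fixed-size state of polynomial-projection coefficients and update it recurrently as each new input arrives. The sketch below uses the commonly quoted HiPPO-LegS-style operators and a simple implicit-Euler step purely for illustration; the paper derives the exact operators and discretization, so treat the details here as assumptions.

```python
import numpy as np

def hippo_legs_matrices(N):
    # Commonly quoted HiPPO-LegS operators (lower-triangular A, vector B);
    # see the paper for the derivation -- treated as an assumption here.
    n = np.arange(N)
    A = np.tril(np.sqrt((2 * n[:, None] + 1) * (2 * n[None, :] + 1)), -1)
    A = A + np.diag(n + 1.0)
    B = np.sqrt(2 * n + 1.0)
    return A, B

def compress_sequence(u, N=32):
    """Online memory: keep N coefficients c summarizing the whole prefix and
    update them once per input. Uses an implicit-Euler step of the scaled ODE
    dc/dt = (-A c + B u) / t for numerical stability in this toy; the paper
    derives the exact discretization."""
    A, B = hippo_legs_matrices(N)
    I = np.eye(N)
    c = np.zeros(N)
    for k, u_k in enumerate(u, start=1):
        c = np.linalg.solve(I + A / k, c + (B * u_k) / k)
    return c  # fixed-size summary of everything seen so far

coeffs = compress_sequence(np.sin(np.linspace(0, 8 * np.pi, 500)))
print(coeffs.shape)  # (32,)
```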

Randomized numerical linear algebra: Foundations and algorithms

PG Martinsson, JA Tropp - Acta Numerica, 2020 - cambridge.org
This survey describes probabilistic algorithms for linear algebraic computations, such as
factorizing matrices and solving linear systems. It focuses on techniques that have a proven …
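A representative algorithm from this literature is the randomized range finder / randomized SVD: sketch the range of the matrix with a random test matrix, orthonormalize, then solve a small deterministic problem. The NumPy sketch below follows that textbook recipe; the parameter choices (oversampling, power iterations) are illustrative defaults, not prescriptions from the survey.

```python
import numpy as np

def randomized_svd(A, rank, n_oversample=10, n_iter=2, seed=0):
    """Randomized SVD: sample the range of A with a Gaussian test matrix,
    then compute an exact SVD of the small projected problem."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + n_oversample, min(m, n))
    Omega = rng.standard_normal((n, k))   # random test matrix
    Y = A @ Omega                         # sample the range of A
    for _ in range(n_iter):               # power iterations sharpen the spectrum
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                # orthonormal basis for the sampled range
    B = Q.T @ A                           # small (k, n) projected matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ Ub[:, :rank], s[:rank], Vt[:rank]

# Usage: an exactly low-rank matrix is recovered almost exactly from the sketch.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 50)) @ rng.standard_normal((50, 1500))
U, s, Vt = randomized_svd(A, rank=50)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))  # tiny on this toy
```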

Simple hardware-efficient long convolutions for sequence modeling

DY Fu, EL Epstein, E Nguyen… - International …, 2023 - proceedings.mlr.press
State space models (SSMs) have high performance on long sequence modeling but require
sophisticated initialization techniques and specialized implementations for high quality and …
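The workhorse behind these models is a convolution whose kernel is as long as the input, evaluated in O(N log N) via the FFT rather than O(N²) directly. Below is a minimal NumPy sketch of that causal FFT convolution, checked against the direct definition; the paper's contribution is making such convolutions simple to train and hardware-efficient, which this toy does not attempt.

```python
import numpy as np

def causal_long_conv(u, k):
    """Causal convolution of a length-N signal u with a length-N kernel k via FFT.
    Zero-padding to 2N avoids circular wrap-around, so output[t] depends only on
    u[:t+1]. Cost is O(N log N) instead of O(N^2) for the direct sum."""
    N = u.shape[-1]
    L = 2 * N
    U = np.fft.rfft(u, n=L)
    K = np.fft.rfft(k, n=L)
    return np.fft.irfft(U * K, n=L)[..., :N]

# Check against the O(N^2) direct definition on a toy example.
rng = np.random.default_rng(0)
u = rng.standard_normal(256)
k = rng.standard_normal(256)
direct = np.array([np.dot(u[: t + 1][::-1], k[: t + 1]) for t in range(256)])
print(np.allclose(causal_long_conv(u, k), direct))  # True
```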

Monarch: Expressive structured matrices for efficient and accurate training

T Dao, B Chen, NS Sohoni, A Desai… - International …, 2022 - proceedings.mlr.press
Large neural networks excel in many domains, but they are expensive to train and fine-tune.
A popular approach to reduce their compute or memory requirements is to replace dense …
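Monarch replaces dense weight matrices with products of block-diagonal factors interleaved with fixed permutations, cutting parameters and FLOPs from n² to roughly n·sqrt(n). The sketch below shows that block-diagonal / permute / block-diagonal pattern in spirit only; the exact Monarch parametrization (the permutations, block shapes, and how dense layers are projected onto it) is in the paper, so the details here are assumptions.

```python
import numpy as np

def block_diag_matvec(blocks, x):
    # Apply a block-diagonal matrix, stored as (num_blocks, b, b), to x of length num_blocks*b.
    m, b, _ = blocks.shape
    return np.einsum("nij,nj->ni", blocks, x.reshape(m, b)).reshape(-1)

def monarch_like_matvec(B1, B2, x):
    """y = B2 @ P @ B1 @ x, with P a fixed reshape-transpose permutation.
    A hedged sketch of the block-diagonal / permute / block-diagonal pattern
    behind Monarch-style structured matrices, not the paper's exact
    parametrization. Assumes n = b*b, with b blocks of size (b, b) per factor."""
    b = B1.shape[0]
    h = block_diag_matvec(B1, x)
    h = h.reshape(b, b).T.reshape(-1)     # permutation P: transpose the b-by-b grid
    return block_diag_matvec(B2, h)

# A dense matvec on n = 64 costs n^2 = 4096 multiplies and n^2 parameters;
# this factorization costs 2 * n * b = 1024 of each.
rng = np.random.default_rng(0)
b = 8
B1, B2 = rng.standard_normal((2, b, b, b))
x = rng.standard_normal(b * b)
print(monarch_like_matvec(B1, B2, x).shape)  # (64,)
```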

Scatterbrain: Unifying sparse and low-rank attention

B Chen, T Dao, E Winsor, Z Song… - Advances in Neural …, 2021 - proceedings.neurips.cc
Recent advances in efficient Transformers have exploited either the sparsity or low-rank
properties of attention matrices to reduce the computational and memory bottlenecks of …
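The snippet names the two standard routes to sub-quadratic attention, sparsity and low rank, which Scatterbrain unifies. The sketch below illustrates one simple way to combine them, assumed for illustration: a Performer-style random-feature (low-rank) estimate of softmax attention plus an exact correction on a local sparse band. The paper's actual estimator (how the sparse support is chosen and how the correction is defined) differs in its details.

```python
import numpy as np

def softmax_features(X, W):
    """Positive random features so that E[phi(q) . phi(k)] ~= exp(q . k);
    one of several possible low-rank estimators."""
    m = W.shape[0]
    return np.exp(X @ W.T - 0.5 * np.sum(X**2, axis=-1, keepdims=True)) / np.sqrt(m)

def sparse_plus_lowrank_attention(Q, K, V, n_features=128, window=16, seed=0):
    """Hedged sketch of sparse + low-rank attention: a random-feature (low-rank)
    approximation of softmax attention, plus an exact sparse correction on a
    local band where the low-rank estimate is subtracted out and replaced by
    the true scores. Not the paper's exact estimator."""
    N, d = Q.shape
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_features, d))
    Qs, Ks = Q / d**0.25, K / d**0.25            # fold in the 1/sqrt(d) temperature
    phi_q, phi_k = softmax_features(Qs, W), softmax_features(Ks, W)

    num = phi_q @ (phi_k.T @ V)                  # low-rank numerator, O(N * m * d)
    den = phi_q @ phi_k.sum(axis=0)              # low-rank softmax normalizer
    for i in range(N):                           # sparse band: exact minus low-rank
        lo, hi = max(0, i - window), min(N, i + window + 1)
        exact = np.exp(Qs[i] @ Ks[lo:hi].T)
        approx = phi_q[i] @ phi_k[lo:hi].T
        num[i] += (exact - approx) @ V[lo:hi]
        den[i] += (exact - approx).sum()
    return num / den[:, None]

rng = np.random.default_rng(1)
Q, K, V = rng.standard_normal((3, 256, 32))
print(sparse_plus_lowrank_attention(Q, K, V).shape)  # (256, 32)
```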

Fast sparse convnets

E Elsen, M Dukhan, T Gale… - Proceedings of the …, 2020 - openaccess.thecvf.com
Historically, the pursuit of efficient inference has been one of the driving forces behind the
research into new deep learning architectures and building blocks. Some of the recent …