Large language models for human–robot interaction: A review

C Zhang, J Chen, J Li, Y Peng, Z Mao - Biomimetic Intelligence and …, 2023 - Elsevier
The fusion of large language models and robotic systems has introduced a transformative
paradigm in human–robot interaction, offering unparalleled capabilities in natural language …

A survey of transformers

T Lin, Y Wang, X Liu, X Qiu - AI open, 2022 - Elsevier
Transformers have achieved great success in many artificial intelligence fields, such as
natural language processing, computer vision, and audio processing. Therefore, it is natural …

Fast inference from transformers via speculative decoding

Y Leviathan, M Kalman… - … Conference on Machine …, 2023 - proceedings.mlr.press
Inference from large autoregressive models like Transformers is slow: decoding K tokens
takes K serial runs of the model. In this work we introduce speculative decoding, an …
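The accept/reject rule at the heart of speculative decoding is simple enough to sketch. The toy NumPy example below is an illustration only, not the paper's implementation: target_dist and draft_dist are hypothetical stand-ins for the large and small models, and the vocabulary is tiny. A drafted token x is kept with probability min(1, p(x)/q(x)); on the first rejection a replacement is drawn from the normalised residual max(0, p - q), which keeps the output distributed exactly as the target p.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 8  # toy vocabulary size

def target_dist(prefix):
    """Stand-in for the large model p(. | prefix): a deterministic toy categorical."""
    logits = np.sin(np.arange(VOCAB) + len(prefix))
    e = np.exp(logits - logits.max())
    return e / e.sum()

def draft_dist(prefix):
    """Stand-in for the small draft model q(. | prefix), deliberately similar to p."""
    logits = np.sin(np.arange(VOCAB) + len(prefix)) * 0.8
    e = np.exp(logits - logits.max())
    return e / e.sum()

def speculative_step(prefix, K=4):
    """Draft K tokens with q, then accept/reject each against p."""
    drafted = []
    for _ in range(K):
        q = draft_dist(prefix + drafted)
        drafted.append(int(rng.choice(VOCAB, p=q)))

    accepted = []
    for x in drafted:
        p = target_dist(prefix + accepted)
        q = draft_dist(prefix + accepted)
        if rng.random() < min(1.0, p[x] / q[x]):
            accepted.append(x)                 # kept; output still follows p
        else:
            residual = np.maximum(p - q, 0.0)  # resample from normalised residual
            residual /= residual.sum()
            accepted.append(int(rng.choice(VOCAB, p=residual)))
            return accepted                    # stop at the first rejection
    # all K drafts accepted: sample one bonus token directly from p
    p = target_dist(prefix + accepted)
    accepted.append(int(rng.choice(VOCAB, p=p)))
    return accepted

print(speculative_step([1, 2, 3]))
```

The speed-up comes from the fact that all K drafted positions can be scored by the target model in one parallel pass, so accepting several tokens amortises a single serial run.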

Efficiently scaling transformer inference

R Pope, S Douglas, A Chowdhery… - Proceedings of …, 2023 - proceedings.mlsys.org
We study the problem of efficient generative inference for Transformer models, in one of its
most challenging settings: large deep models, with tight latency targets and long sequence …

FlashAttention: Fast and memory-efficient exact attention with IO-awareness

T Dao, D Fu, S Ermon, A Rudra… - Advances in neural …, 2022 - proceedings.neurips.cc
Transformers are slow and memory-hungry on long sequences, since the time and memory
complexity of self-attention are quadratic in sequence length. Approximate attention …
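The quadratic cost named in the snippet comes from materialising the full N x N score matrix. The sketch below, a NumPy illustration only (the real FlashAttention gains come from tiling in GPU SRAM, not from NumPy), shows the underlying algorithmic idea: visit keys and values in blocks and compute the softmax online, carrying a running max and running denominator across blocks so the full score matrix never exists.

```python
import numpy as np

def naive_attention(Q, K, V):
    """Reference attention: materialises the full N x N score matrix."""
    S = Q @ K.T / np.sqrt(Q.shape[-1])        # the O(N^2) memory lives here
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def blocked_attention(Q, K, V, block=16):
    """Online-softmax attention over key/value blocks; no N x N matrix is stored."""
    d, N = Q.shape[-1], Q.shape[0]
    m = np.full(N, -np.inf)                   # running row-max of scores
    l = np.zeros(N)                           # running softmax denominator
    acc = np.zeros((N, V.shape[-1]))          # running unnormalised output
    for start in range(0, K.shape[0], block):
        Kb, Vb = K[start:start+block], V[start:start+block]
        S = Q @ Kb.T / np.sqrt(d)             # N x block, never N x N
        m_new = np.maximum(m, S.max(axis=-1))
        scale = np.exp(m - m_new)             # rescale old stats to the new max
        P = np.exp(S - m_new[:, None])
        l = l * scale + P.sum(axis=-1)
        acc = acc * scale[:, None] + P @ Vb
        m = m_new
    return acc / l[:, None]

rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((64, 32)) for _ in range(3))
assert np.allclose(naive_attention(Q, K, V), blocked_attention(Q, K, V))
```

Because the blocked version is exact (not an approximation), it matches the naive result to floating-point precision, which is the "exact attention" claim in the title.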

ConViT: Improving vision transformers with soft convolutional inductive biases

S d'Ascoli, H Touvron, ML Leavitt… - International …, 2021 - proceedings.mlr.press
Convolutional architectures have proven extremely successful for vision tasks. Their hard
inductive biases enable sample-efficient learning, but come at the cost of a potentially lower …

Perceiver: General perception with iterative attention

A Jaegle, F Gimeno, A Brock… - International …, 2021 - proceedings.mlr.press
Biological systems understand the world by simultaneously processing high-dimensional
inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The …

Switch transformers: Scaling to trillion parameter models with simple and efficient sparsity

W Fedus, B Zoph, N Shazeer - Journal of Machine Learning Research, 2022 - jmlr.org
In deep learning, models typically reuse the same parameters for all inputs. Mixture of
Experts (MoE) models defy this and instead select different parameters for each incoming …
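The routing idea in the snippet is easy to make concrete. Below is a minimal, hypothetical NumPy sketch of Switch-style top-1 routing, not the paper's implementation: a learned router scores each token against the experts, the argmax expert alone processes the token, and the output is scaled by the gate probability. Capacity limits and the load-balancing loss from the paper are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
D, N_EXPERTS, TOKENS = 16, 4, 8

# One small linear "expert" per slot; only the routed expert runs for a token.
W_router = rng.standard_normal((D, N_EXPERTS)) * 0.1
experts = [rng.standard_normal((D, D)) * 0.1 for _ in range(N_EXPERTS)]

def switch_layer(x):
    """Top-1 (Switch-style) routing: the softmax router picks one expert per
    token; the output is scaled by the gate probability, which is what keeps
    the router differentiable in a real trained implementation."""
    logits = x @ W_router                      # (tokens, experts)
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    choice = probs.argmax(axis=-1)             # top-1 expert per token
    gate = probs[np.arange(len(x)), choice]    # gate value of the chosen expert
    out = np.empty_like(x)
    for e in range(N_EXPERTS):                 # each expert sees only its tokens
        mask = choice == e
        if mask.any():
            out[mask] = (x[mask] @ experts[e]) * gate[mask, None]
    return out, choice

x = rng.standard_normal((TOKENS, D))
y, routing = switch_layer(x)
print("tokens per expert:", np.bincount(routing, minlength=N_EXPERTS))
```

This is what lets parameter count grow with the number of experts while per-token compute stays roughly constant: each token pays for one expert, not all of them.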

MeMViT: Memory-augmented multiscale vision transformer for efficient long-term video recognition

CY Wu, Y Li, K Mangalam, H Fan… - Proceedings of the …, 2022 - openaccess.thecvf.com
While today's video recognition systems parse snapshots or short clips accurately, they
cannot yet connect the dots and reason across a longer range of time. Most existing video …

Attention mechanism in neural networks: where it comes and where it goes

D Soydaner - Neural Computing and Applications, 2022 - Springer
A long time ago in the machine learning literature, the idea of incorporating a mechanism
inspired by the human visual system into neural networks was introduced. This idea is …