Transformers in vision: A survey

S Khan, M Naseer, M Hayat, SW Zamir… - ACM computing …, 2022 - dl.acm.org
Astounding results from Transformer models on natural language tasks have intrigued the
vision community to study their application to computer vision problems. Among their salient …

Vector symbolic architectures as a computing framework for emerging hardware

D Kleyko, M Davies, EP Frady, P Kanerva… - Proceedings of the …, 2022 - ieeexplore.ieee.org
This article reviews recent progress in the development of the computing framework vector
symbolic architectures (VSA) (also known as hyperdimensional computing). This framework …

Towards revealing the mystery behind chain of thought: a theoretical perspective

G Feng, B Zhang, Y Gu, H Ye, D He… - Advances in Neural …, 2023 - proceedings.neurips.cc
Recent studies have discovered that Chain-of-Thought prompting (CoT) can dramatically
improve the performance of Large Language Models (LLMs), particularly when dealing with …

Transformers as statisticians: Provable in-context learning with in-context algorithm selection

Y Bai, F Chen, H Wang, C Xiong… - Advances in Neural …, 2023 - proceedings.neurips.cc
Neural sequence models based on the transformer architecture have demonstrated
remarkable in-context learning (ICL) abilities, where they can perform new tasks …

Trained transformers learn linear models in-context

R Zhang, S Frei, PL Bartlett - Journal of Machine Learning Research, 2024 - jmlr.org
Attention-based neural networks such as transformers have demonstrated a remarkable
ability to exhibit in-context learning (ICL): Given a short prompt sequence of tokens from an …

What can transformers learn in-context? a case study of simple function classes

S Garg, D Tsipras, PS Liang… - Advances in Neural …, 2022 - proceedings.neurips.cc
In-context learning is the ability of a model to condition on a prompt sequence consisting of
in-context examples (input-output pairs corresponding to some task) along with a new query …

Representational strengths and limitations of transformers

C Sanford, DJ Hsu, M Telgarsky - Advances in Neural …, 2023 - proceedings.neurips.cc
Attention layers, as commonly used in transformers, form the backbone of modern deep
learning, yet there is no mathematical description of their benefits and deficiencies as …

Attention is not all you need: Pure attention loses rank doubly exponentially with depth

Y Dong, JB Cordonnier… - … conference on machine …, 2021 - proceedings.mlr.press
Attention-based architectures have become ubiquitous in machine learning. Yet, our
understanding of the reasons for their effectiveness remains limited. This work proposes a …

Big bird: Transformers for longer sequences

M Zaheer, G Guruganesh, KA Dubey… - Advances in Neural …, 2020 - proceedings.neurips.cc
Transformers-based models, such as BERT, have been one of the most successful deep
learning models for NLP. Unfortunately, one of their core limitations is the quadratic …

Chain of thought empowers transformers to solve inherently serial problems

Z Li, H Liu, D Zhou, T Ma - arXiv preprint arXiv:2402.12875, 2024 - academia.edu
Instructing the model to generate a sequence of intermediate steps, aka, a chain of thought
(CoT), is a highly effective method to improve the accuracy of large language models (LLMs) …