A review on the attention mechanism of deep learning

Z Niu, G Zhong, H Yu - Neurocomputing, 2021 - Elsevier
Attention has arguably become one of the most important concepts in the deep learning
field. It is inspired by the biological systems of humans that tend to focus on the distinctive …
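
For orientation, the central object this review surveys, scaled dot-product attention as popularized by transformers, fits in a few lines. The NumPy sketch below is my own illustration of that standard formulation, not code from the paper:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity logits
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)               # row-wise softmax
    return w @ V                                     # weighted sum of values

# Toy usage: 3 queries attending over 4 key/value pairs.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 5))
out = scaled_dot_product_attention(Q, K, V)          # shape (3, 5)
```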

Social interactions for autonomous driving: A review and perspectives

W Wang, L Wang, C Zhang, C Liu… - Foundations and Trends …, 2022 - nowpublishers.com
No human drives a car in a vacuum; they must negotiate with other road users to achieve
their goals in social traffic scenes. A rational human driver can interact with other road users …

Transformers learn to implement preconditioned gradient descent for in-context learning

K Ahn, X Cheng, H Daneshmand… - Advances in Neural …, 2023 - proceedings.neurips.cc
Several recent works demonstrate that transformers can implement algorithms like gradient
descent. By a careful construction of weights, these works show that multiple layers of …
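
For reference, the algorithm the title says transformers learn to emulate is ordinary preconditioned gradient descent on a least-squares objective. The sketch below is my own illustration of that baseline algorithm, not the authors' weight construction; the choice of the inverse data covariance as preconditioner P is a natural but assumed one:

```python
import numpy as np

# Preconditioned gradient descent on least squares: minimize ||X w - y||^2 / (2n).
# Update rule: w <- w - eta * P @ grad.
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true

P = np.linalg.inv(X.T @ X / n)        # inverse covariance as preconditioner (illustrative choice)
w, eta = np.zeros(d), 1.0
for _ in range(10):
    grad = X.T @ (X @ w - y) / n      # gradient of the least-squares loss
    w = w - eta * P @ grad            # preconditioned step

print(np.linalg.norm(w - w_true))     # near zero; with this P the quadratic is solved in one step
```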

WizardMath: Empowering mathematical reasoning for large language models via reinforced Evol-Instruct

H Luo, Q Sun, C Xu, P Zhao, J Lou, C Tao… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs), such as GPT-4, have shown remarkable performance in
natural language processing (NLP) tasks, including challenging mathematical reasoning …

On the opportunities and risks of foundation models

R Bommasani, DA Hudson, E Adeli, R Altman… - arXiv preprint arXiv …, 2021 - arxiv.org
AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are
trained on broad data at scale and are adaptable to a wide range of downstream tasks. We …

Program synthesis with large language models

J Austin, A Odena, M Nye, M Bosma… - arXiv preprint arXiv …, 2021 - arxiv.org
This paper explores the limits of the current generation of large language models for
program synthesis in general purpose programming languages. We evaluate a collection of …

MemoryBank: Enhancing large language models with long-term memory

W Zhong, L Guo, Q Gao, H Ye, Y Wang - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Large Language Models (LLMs) have drastically reshaped our interactions with
artificial intelligence (AI) systems, showcasing impressive performance across an extensive …

Show your work: Scratchpads for intermediate computation with language models

M Nye, AJ Andreassen, G Gur-Ari… - arXiv preprint arXiv …, 2021 - arxiv.org
Large pre-trained language models perform remarkably well on tasks that can be done "in
one pass", such as generating realistic text or synthesizing computer programs. However …

RULER: What's the Real Context Size of Your Long-Context Language Models?

CP Hsieh, S Sun, S Kriman, S Acharya… - arXiv preprint arXiv …, 2024 - arxiv.org
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …

Perceiver: General perception with iterative attention

A Jaegle, F Gimeno, A Brock… - International …, 2021 - proceedings.mlr.press
Biological systems understand the world by simultaneously processing high-dimensional
inputs from modalities as diverse as vision, audition, touch, proprioception, etc. The …
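
The mechanism behind the title's "iterative attention" is a small latent array repeatedly cross-attending to a large input array, so compute scales with the latent size rather than quadratically in the input size. The NumPy sketch below is a schematic of that cross-attention read, not the full architecture; the sizes are arbitrary and the latents and projections are random placeholders rather than learned weights:

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_latent, d = 10_000, 64, 32     # large input array, small latent array

inputs  = rng.normal(size=(n_input, d))   # e.g. flattened pixels or audio samples
latents = rng.normal(size=(n_latent, d))  # latent array (learned in practice, random here)

def cross_attend(latents, inputs):
    """Latents query the inputs: cost is O(n_latent * n_input), not O(n_input^2)."""
    scores = latents @ inputs.T / np.sqrt(latents.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)    # row-wise softmax over the inputs
    return w @ inputs                     # each latent becomes a summary of the input

# "Iterative attention": repeat the read so the latents refine their summary.
for _ in range(4):
    latents = cross_attend(latents, inputs)
```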