Google Scholar

Tool learning with foundation models

Y Qin, S Hu, Y Lin, W Chen, N Ding, G Cui… - ACM Computing …, 2024 - dl.acm.org

Humans possess an extraordinary ability to create and utilize tools. With the advent of
foundation models, artificial intelligence systems have the potential to be equally adept in …

Cited by 312, 10 versions

Reinforcement learning algorithms: A brief survey

AK Shakya, G Pillai, S Chakrabarty - Expert Systems with Applications, 2023 - Elsevier

Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …

Cited by 225, 2 versions

Champion-level drone racing using deep reinforcement learning

E Kaufmann, L Bauersfeld, A Loquercio, M Müller… - Nature, 2023 - nature.com

First-person view (FPV) drone racing is a televised sport in which professional competitors
pilot high-speed aircraft through a 3D circuit. Each pilot sees the environment from the …

Cited by 450, 8 versions

Mastering diverse domains through world models

D Hafner, J Pasukonis, J Ba, T Lillicrap - arxiv preprint arxiv …, 2023 - arxiv.org

Developing a general algorithm that learns to solve tasks across a wide range of
applications has been a fundamental challenge in artificial intelligence. Although current …

Cited by 566, 2 versions

Reasoning with language model is planning with world model

S Hao, Y Gu, H Ma, JJ Hong, Z Wang, DZ Wang… - arxiv preprint arxiv …, 2023 - arxiv.org

Large language models (LLMs) have shown remarkable reasoning capabilities, especially
when prompted to generate intermediate reasoning steps (e.g., Chain-of-Thought, CoT) …

Cited by 452, 10 versions

Faster sorting algorithms discovered using deep reinforcement learning

DJ Mankowitz, A Michi, A Zhernov, M Gelmi, M Selvi… - Nature, 2023 - nature.com

Fundamental algorithms such as sorting or hashing are used trillions of times on any given
day. As demand for computation grows, it has become critical for these algorithms to be as …

Cited by 199, 9 versions

Large language models as commonsense knowledge for large-scale task planning

Z Zhao, WS Lee, D Hsu - Advances in Neural Information …, 2023 - proceedings.neurips.cc

Large-scale task planning is a major challenge. Recent work exploits large language
models (LLMs) directly as a policy and shows surprisingly interesting results. This paper …

Cited by 193, 7 versions

Gaia-1: A generative world model for autonomous driving

A Hu, L Russell, H Yeo, Z Murez, G Fedoseev… - arxiv preprint arxiv …, 2023 - arxiv.org

Autonomous driving promises transformative improvements to transportation, but building
systems capable of safely navigating the unstructured complexity of real-world scenarios …

Cited by 180, 2 versions

All-analog photoelectronic chip for high-speed vision tasks

Y Chen, M Nazhamaiti, H Xu, Y Meng, T Zhou, G Li… - Nature, 2023 - nature.com

Photonic computing enables faster and more energy-efficient processing of vision data.
However, experimental superiority of deployable systems remains a challenge because of …

Cited by 135, 9 versions

Rest-mcts*: Llm self-training via process reward guided tree search

D Zhang, S Zhoubian, Z Hu, Y Yue… - Advances in Neural …, 2025 - proceedings.neurips.cc

Recent methodologies in LLM self-training mostly rely on LLM generating responses and
filtering those with correct output answers as training data. This approach often yields a low …

Cited by 72, 6 versions

Saved in my library

Mastering atari, go, chess and shogi by planning with a learned model

Tool learning with foundation models
