Google Scholar

On transforming reinforcement learning with transformers: The development trajectory

S Hu, L Shen, Y Zhang, Y Chen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org

Transformers, originally devised for natural language processing (NLP), have also produced
significant successes in computer vision (CV). Due to their strong expression power …

Cited by 36; 7 versions

[PDF] arxiv.org

Reinforced self-training (rest) for language modeling

C Gulcehre, TL Paine, S Srinivasan… - arxiv preprint arxiv …, 2023 - arxiv.org

Reinforcement learning from human feedback (RLHF) can improve the quality of large
language model's (LLM) outputs by aligning them with human preferences. We propose a …

Cited by 230; 5 versions

[PDF] arxiv.org

A generalist agent

S Reed, K Zolna, E Parisotto, SG Colmenarejo… - arxiv preprint arxiv …, 2022 - arxiv.org

Inspired by progress in large-scale language modeling, we apply a similar approach
towards building a single generalist agent beyond the realm of text outputs. The agent …

Cited by 1011; 4 versions

[PDF] arxiv.org

Roboagent: Generalization and efficiency in robot manipulation via semantic augmentations and action chunking

H Bharadhwaj, J Vakil, M Sharma… - … on Robotics and …, 2024 - ieeexplore.ieee.org

The grand aim of having a single robot that can manipulate arbitrary objects in diverse
settings is at odds with the paucity of robotics datasets. Acquiring and growing such datasets …

Cited by 96; 4 versions

[PDF] mpg.de

Replay in minds and machines

L Wittkuhn, S Chien, S Hall-McMaster… - … & Biobehavioral Reviews, 2021 - Elsevier

Experience-related brain activity patterns reactivate during sleep, wakeful rest, and brief
pauses from active behavior. In parallel, machine learning research has found that …

Cited by 54; 14 versions

[PDF] neurips.cc

Collaborating with humans without human data

DJ Strouse, K McKee, M Botvinick… - Advances in …, 2021 - proceedings.neurips.cc

Collaborating with humans requires rapidly adapting to their individual strengths,
weaknesses, and preferences. Unfortunately, most standard multi-agent reinforcement …

Cited by 188; 7 versions

[PDF] github.io

A virtual rodent predicts the structure of neural activity across behaviours

D Aldarondo, J Merel, JD Marshall, L Hasenclever… - Nature, 2024 - nature.com

Animals have exquisite control of their bodies, allowing them to perform a diverse range of
behaviours. How such control is implemented by the brain, however, remains unclear …

Cited by 21; 7 versions

[PDF] mlr.press

Stabilizing transformers for reinforcement learning

E Parisotto, F Song, J Rae, R Pascanu… - International …, 2020 - proceedings.mlr.press

Owing to their ability to both effectively integrate information over long time horizons and
scale to massive amounts of data, self-attention architectures have recently shown …

Cited by 447; 9 versions

[PDF] arxiv.org

What matters in on-policy reinforcement learning? a large-scale empirical study

M Andrychowicz, A Raichuk, P Stańczyk… - arxiv preprint arxiv …, 2020 - arxiv.org

In recent years, on-policy reinforcement learning (RL) has been successfully applied to
many different continuous control tasks. While RL algorithms are often conceptually simple …

Cited by 262; 5 versions

[PDF] openreview.net

What matters for on-policy deep actor-critic methods? a large-scale study

M Andrychowicz, A Raichuk, P Stańczyk… - International …, 2021 - openreview.net

In recent years, reinforcement learning (RL) has been successfully applied to many different
continuous control tasks. While RL algorithms are often conceptually simple, their state-of …

Cited by 211; 4 versions

V-mpo: On-policy maximum a posteriori policy optimization for discrete and continuous control

On transforming reinforcement learning with transformers: The development trajectory