Google Scholar

Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Cited by 1704, 8 versions

[PDF] arxiv.org

Deep reinforcement learning in computer vision: a comprehensive survey

N Le, VS Rathour, K Yamazaki, K Luu… - Artificial Intelligence …, 2022 - Springer

Deep reinforcement learning augments the reinforcement learning framework and utilizes
the powerful representation of deep neural networks. Recent works have demonstrated the …

Cited by 212, 10 versions

[PDF] arxiv.org

Voyager: An open-ended embodied agent with large language models

G Wang, Y Xie, Y Jiang, A Mandlekar, C Xiao… - arXiv preprint arXiv …, 2023 - arxiv.org

We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft
that continuously explores the world, acquires diverse skills, and makes novel discoveries …

Cited by 789, 4 versions

[PDF] mlr.press

Perceiver-Actor: A multi-task transformer for robotic manipulation

M Shridhar, L Manuelli, D Fox - Conference on Robot …, 2023 - proceedings.mlr.press

Transformers have revolutionized vision and natural language processing with their ability to
scale with large datasets. But in robotic manipulation, data is both limited and expensive …

Cited by 462, 5 versions

[PDF] neurips.cc

MineDojo: Building open-ended embodied agents with internet-scale knowledge

L Fan, G Wang, Y Jiang, A Mandlekar… - Advances in …, 2022 - proceedings.neurips.cc

Autonomous agents have made great strides in specialist domains like Atari games and Go.
However, they typically learn tabula rasa in isolated environments with limited and manually …

Cited by 357, 7 versions

[PDF] arxiv.org

Self-play fine-tuning converts weak language models to strong language models

Z Chen, Y Deng, H Yuan, K Ji, Q Gu - arXiv preprint arXiv:2401.01335, 2024 - arxiv.org

Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is
pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the …

Cited by 188, 3 versions

[PDF] caltech.edu

[PDF] VIMA: General robot manipulation with multimodal prompts

Y Jiang, A Gupta, Z Zhang, G Wang… - arXiv preprint …, 2022 - authors.library.caltech.edu

Prompt-based learning has emerged as a successful paradigm in natural language
processing, where a single general-purpose language model can be instructed to perform …

Cited by 205, 6 versions

[PDF] neurips.cc

Uncertainty-based offline reinforcement learning with diversified Q-ensemble

G An, S Moon, JH Kim… - Advances in neural …, 2021 - proceedings.neurips.cc

Offline reinforcement learning (offline RL), which aims to find an optimal policy from a
previously collected static dataset, bears algorithmic difficulties due to function …

Cited by 305, 7 versions

[PDF] mlr.press

Language instructed reinforcement learning for human-AI coordination

H Hu, D Sadigh - International Conference on Machine …, 2023 - proceedings.mlr.press

One of the fundamental quests of AI is to produce agents that coordinate well with humans.
This problem is challenging, especially in domains that lack high quality human behavioral …

Cited by 62, 7 versions

[PDF] annualreviews.org

Toward a theoretical foundation of policy optimization for learning control policies

B Hu, K Zhang, N Li, M Mesbahi… - Annual Review of …, 2023 - annualreviews.org

Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …

Cited by 87, 6 versions
