Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
Deep reinforcement learning in computer vision: a comprehensive survey
Deep reinforcement learning augments the reinforcement learning framework and utilizes
the powerful representation of deep neural networks. Recent works have demonstrated the …
Voyager: An open-ended embodied agent with large language models
We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft
that continuously explores the world, acquires diverse skills, and makes novel discoveries …
Perceiver-actor: A multi-task transformer for robotic manipulation
Transformers have revolutionized vision and natural language processing with their ability to
scale with large datasets. But in robotic manipulation, data is both limited and expensive …
MineDojo: Building open-ended embodied agents with internet-scale knowledge
Autonomous agents have made great strides in specialist domains like Atari games and Go.
However, they typically learn tabula rasa in isolated environments with limited and manually …
Self-play fine-tuning converts weak language models to strong language models
Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is
pivotal for advancing Large Language Models (LLMs). In this paper, we delve into the …
Vima: General robot manipulation with multimodal prompts
Prompt-based learning has emerged as a successful paradigm in natural language
processing, where a single general-purpose language model can be instructed to perform …
Uncertainty-based offline reinforcement learning with diversified Q-ensemble
Offline reinforcement learning (offline RL), which aims to find an optimal policy from a
previously collected static dataset, bears algorithmic difficulties due to function …
Language instructed reinforcement learning for human-AI coordination
One of the fundamental quests of AI is to produce agents that coordinate well with humans.
This problem is challenging, especially in domains that lack high quality human behavioral …
Toward a theoretical foundation of policy optimization for learning control policies
Gradient-based methods have been widely used for system design and optimization in
diverse application domains. Recently, there has been a renewed interest in studying …