A comprehensive survey of continual learning: Theory, method and application
To cope with real-world dynamics, an intelligent system needs to incrementally acquire,
update, accumulate, and exploit knowledge throughout its lifetime. This ability, known as …
Model-based reinforcement learning: A survey
Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …
Model-based offline planning
Offline learning is a key part of making reinforcement learning (RL) useable in real systems.
Offline RL looks at scenarios where there is data from a system's operation, but no direct …
Continual world: A robotic benchmark for continual reinforcement learning
Continual learning (CL)---the ability to continuously learn, building on previously
acquired knowledge---is a natural requirement for long-lived autonomous reinforcement …
Optimizing for the future in non-stationary MDPs
Most reinforcement learning methods are based upon the key assumption that the transition
dynamics and reward functions are fixed, that is, the underlying Markov decision process is …
Prediction and control in continual reinforcement learning
Temporal difference (TD) learning is often used to update the estimate of the value function
which is used by RL agents to extract useful policies. In this paper, we focus on value …
Reset-free lifelong learning with skill-space planning
The objective of lifelong reinforcement learning (RL) is to optimize agents which can
continuously adapt and interact in changing environments. However, current RL approaches …
Learning skills to patch plans based on inaccurate models
Planners using accurate models can be effective for accomplishing manipulation tasks in the
real world, but are typically highly specialized and require significant fine-tuning to be …
Neural-progressive hedging: Enforcing constraints in reinforcement learning with stochastic programming
We propose a framework, called neural-progressive hedging (NP), that leverages stochastic
programming during the online phase of executing a reinforcement learning (RL) policy. The …
Uncertainty-sensitive learning and planning with ensembles
We propose a reinforcement learning framework for discrete environments in which an
agent makes both strategic and tactical decisions. The former manifests itself through the …