Statistical learning theory for control: A finite-sample perspective
Learning algorithms have become an integral component to modern engineering solutions.
Examples range from self-driving cars and recommender systems to finance and even …
Efficient model-based reinforcement learning through optimistic policy search and planning
Model-based reinforcement learning algorithms with probabilistic dynamical
models are amongst the most data-efficient learning methods. This is often attributed to their …
Reinforcement learning with fast stabilization in linear dynamical systems
In this work, we study model-based reinforcement learning (RL) in unknown stabilizable
linear dynamical systems. When learning a dynamical system, one needs to stabilize the …
Minimal expected regret in linear quadratic control
We consider the problem of online learning in Linear Quadratic Control systems whose state
transition and state-action transition matrices $A$ and $B$ may be initially unknown. We …
Learning to control linear systems can be hard
In this paper, we study the statistical difficulty of learning to control linear systems. We focus
on two standard benchmarks, the sample complexity of stabilization, and the regret of the …
Regret lower bounds for learning linear quadratic gaussian systems
In this article, we establish regret lower bounds for adaptively controlling an unknown linear
Gaussian system with quadratic costs. We combine ideas from experiment design …
Thompson Sampling Achieves $\tilde{O}(\sqrt{T})$ Regret in Linear Quadratic Control
Thompson Sampling (TS) is an efficient method for decision-making under uncertainty,
where an action is sampled from a carefully prescribed distribution which is updated based …
Identification and adaptive control of Markov jump systems: Sample complexity and regret bounds
Learning how to effectively control unknown dynamical systems is crucial for intelligent
autonomous systems. This task becomes a significant challenge when the underlying …
NeoRL: Efficient Exploration for Nonepisodic RL
We study the problem of nonepisodic reinforcement learning (RL) for nonlinear dynamical
systems, where the system dynamics are unknown and the RL agent has to learn from a …
Task-optimal exploration in linear dynamical systems
Exploration in unknown environments is a fundamental problem in reinforcement learning
and control. In this work, we study task-guided exploration and determine what precisely an …