Bayesian reinforcement learning: A survey

M Ghavamzadeh, S Mannor, J Pineau… - … and Trends® in …, 2015 - nowpublishers.com
Bayesian methods for machine learning have been widely investigated, yielding principled
methods for incorporating prior information into inference algorithms. In this survey, we …

A unified recipe for deriving (time-uniform) PAC-Bayes bounds

B Chugg, H Wang, A Ramdas - Journal of Machine Learning Research, 2023 - jmlr.org
We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike
most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform) …

Fast rates in statistical and online learning

T Van Erven, PD Grünwald, NA Mehta, MD Reid… - The Journal of Machine …, 2015 - jmlr.org
The speed with which a learning algorithm converges as it is presented with more data is a
central problem in machine learning—a fast rate of convergence means less data is needed …

Bayesian nonparametric covariance regression

EB Fox, DB Dunson - The Journal of Machine Learning Research, 2015 - jmlr.org
Capturing predictor-dependent correlations amongst the elements of a multivariate
response vector is fundamental to numerous applied domains, including neuroscience …

PAC-Bayesian lifelong learning for multi-armed bandits

H Flynn, D Reeb, M Kandemir, J Peters - Data Mining and Knowledge …, 2022 - Springer
We present a PAC-Bayesian analysis of lifelong learning. In the lifelong learning problem, a
sequence of learning tasks is observed one-at-a-time, and the goal is to transfer information …

PAC-Bayesian soft actor-critic learning

B Tasdighi, A Akgül, M Haussmann, KK Brink… - arXiv preprint arXiv …, 2023 - arxiv.org
Actor-critic algorithms address the dual goals of reinforcement learning (RL), policy
evaluation and improvement via two separate function approximators. The practicality of this …

Policy learning for domain selection in an extensible multi-domain spoken dialogue system

Z Wang, H Chen, G Wang, H Tian, H Wu… - Proceedings of the …, 2014 - aclanthology.org
This paper proposes a Markov Decision Process and reinforcement learning based
approach for domain selection in a multidomain Spoken Dialogue System built on a …

PAC-Bayes control: learning policies that provably generalize to novel environments

A Majumdar, A Farid, A Sonar - The International Journal of …, 2021 - journals.sagepub.com
Our goal is to learn control policies for robots that provably generalize well to novel
environments given a dataset of example environments. The key technical idea behind our …