Federated reinforcement learning: Linear speedup under Markovian sampling

S Khodadadian, P Sharma, G Joshi… - International …, 2022 - proceedings.mlr.press
Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling
observations from the environment is usually split across multiple agents. However …

The blessing of heterogeneity in federated Q-learning: Linear speedup and beyond

J Woo, G Joshi, Y Chi - International Conference on …, 2023 - proceedings.mlr.press
In this paper, we consider federated Q-learning, which aims to learn an optimal Q-function
by periodically aggregating local Q-estimates trained on local data alone. Focusing on …
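
The snippet describes the basic periodic-aggregation pattern. A minimal illustrative sketch follows (not the paper's exact algorithm; the environment interface, exploration rate, step size, and sync period are all assumptions made here):

    import numpy as np

    def federated_q_learning(envs, n_states, n_actions, gamma=0.99,
                             alpha=0.1, eps=0.1, sync_period=50, n_syncs=200):
        # Each agent keeps a local tabular Q-estimate trained on its own data.
        M = len(envs)
        Q = [np.zeros((n_states, n_actions)) for _ in range(M)]
        states = [env.reset() for env in envs]  # assumed API: reset(), step(a) -> (s', r, done)
        for t in range(n_syncs * sync_period):
            for m, env in enumerate(envs):
                s = states[m]
                a = (np.random.randint(n_actions) if np.random.rand() < eps
                     else int(np.argmax(Q[m][s])))
                s2, r, done = env.step(a)
                Q[m][s, a] += alpha * (r + gamma * Q[m][s2].max() - Q[m][s, a])
                states[m] = env.reset() if done else s2
            if (t + 1) % sync_period == 0:
                # Periodic aggregation: the server averages the local
                # Q-tables and broadcasts the average back to all agents.
                avg = sum(Q) / M
                Q = [avg.copy() for _ in range(M)]
        return sum(Q) / M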

The sample-communication complexity trade-off in federated Q-learning

S Salgia, Y Chi - Advances in Neural Information …, 2025 - proceedings.neurips.cc
We consider the problem of Federated Q-learning, where $M$ agents aim to collaboratively
learn the optimal Q-function of an unknown infinite horizon Markov Decision Process with …
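
For reference, the optimal Q-function the $M$ agents target is the standard fixed point of the Bellman optimality operator for a discounted infinite-horizon MDP (textbook definition, not specific to this paper):

    Q^*(s,a) = r(s,a) + \gamma \, \mathbb{E}_{s' \sim P(\cdot \mid s,a)} \big[ \max_{a'} Q^*(s',a') \big]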

Distributed momentum-based Frank-Wolfe algorithm for stochastic optimization

J Hou, X Zeng, G Wang, J Sun… - IEEE/CAA Journal of …, 2022 - ieeexplore.ieee.org
This paper considers distributed stochastic optimization, in which a number of agents
cooperate to optimize a global objective function through local computations and information …
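
The title's method combines a Frank-Wolfe (projection-free) update with a momentum-averaged stochastic gradient. Below is a single-node sketch of that combination over an l1-ball constraint; the distributed variant would additionally mix iterates with neighbors, and the step-size schedules and gradient oracle are illustrative assumptions:

    import numpy as np

    def momentum_frank_wolfe(stoch_grad, x0, radius=1.0, T=1000):
        x = x0.astype(float).copy()
        d = np.zeros_like(x)
        for t in range(1, T + 1):
            rho, eta = 2.0 / (t + 1), 2.0 / (t + 2)   # illustrative schedules
            d = (1 - rho) * d + rho * stoch_grad(x)   # momentum gradient estimate
            i = int(np.argmax(np.abs(d)))             # LMO for the l1 ball:
            v = np.zeros_like(x)                      # argmin_{||v||_1 <= r} <d, v>
            v[i] = -radius * np.sign(d[i])
            x = x + eta * (v - x)                     # projection-free step
        return x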

Sample and communication-efficient decentralized actor-critic algorithms with finite-time analysis

Z Chen, Y Zhou, RR Chen… - … Conference on Machine …, 2022 - proceedings.mlr.press
Actor-critic (AC) algorithms have been widely used in decentralized multi-agent systems to
learn the optimal joint control policy. However, existing decentralized AC algorithms either …

Federated Q-learning: Linear regret speedup with low communication cost

Z Zheng, F Gao, L Xue, J Yang - arXiv preprint arXiv:2312.15023, 2023 - arxiv.org
In this paper, we consider federated reinforcement learning for tabular episodic Markov
Decision Processes (MDPs) where, under the coordination of a central server, multiple …

Taming communication and sample complexities in decentralized policy evaluation for cooperative multi-agent reinforcement learning

X Zhang, Z Liu, J Liu, Z Zhu… - Advances in Neural …, 2021 - proceedings.neurips.cc
Cooperative multi-agent reinforcement learning (MARL) has received increasing attention in
recent years and has found many scientific and engineering applications. However, a key …

Federated Q-learning with reference-advantage decomposition: almost optimal regret and logarithmic communication cost

Z Zheng, H Zhang, L Xue - arXiv preprint arXiv:2405.18795, 2024 - arxiv.org
In this paper, we consider model-free federated reinforcement learning for tabular episodic
Markov decision processes. Under the coordination of a central server, multiple agents …

Central limit theorem for two-timescale stochastic approximation with Markovian noise: Theory and applications

J Hu, V Doshi - International Conference on Artificial …, 2024 - proceedings.mlr.press
Two-timescale stochastic approximation (TTSA) is among the most general frameworks for
iterative stochastic algorithms. This includes well-known stochastic optimization methods …
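
In its generic form, a TTSA scheme couples a fast and a slow iterate driven by the same (possibly Markovian) noise sequence $\xi_k$ (standard template; $f$ and $g$ are placeholder drift functions introduced here for illustration):

    x_{k+1} = x_k + \alpha_k f(x_k, y_k, \xi_k), \quad y_{k+1} = y_k + \beta_k g(x_k, y_k, \xi_k), \quad \beta_k / \alpha_k \to 0,

so the fast iterate $x_k$ tracks an equilibrium of $f$ for the slowly varying $y_k$; actor-critic is the canonical RL instance (critic on the fast timescale, actor on the slow one).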

Distributed TD(0) with almost no communication

R Liu, A Olshevsky - IEEE Control Systems Letters, 2023 - ieeexplore.ieee.org
We provide a new non-asymptotic analysis of distributed temporal difference learning with
linear function approximation. Our approach relies on “one-shot averaging,” where N agents …
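
"One-shot averaging" means the agents communicate exactly once, after all local computation is done. A minimal sketch under assumed interfaces (a feature map phi and per-agent transition samplers yielding (s, r, s') tuples; the step size and horizon are illustrative):

    import numpy as np

    def one_shot_averaged_td0(samplers, phi, dim, gamma=0.95, alpha=0.01, T=10_000):
        thetas = []
        for sample in samplers:                   # each agent runs TD(0) independently
            theta = np.zeros(dim)
            for _ in range(T):
                s, r, s2 = sample()               # one Markovian transition
                td_err = r + gamma * phi(s2) @ theta - phi(s) @ theta
                theta += alpha * td_err * phi(s)  # linear-FA TD(0) update
            thetas.append(theta)
        return np.mean(thetas, axis=0)            # the single round of communication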