Federated reinforcement learning: Linear speedup under Markovian sampling

S Khodadadian, P Sharma, G Joshi… - International …, 2022 - proceedings.mlr.press
Since reinforcement learning algorithms are notoriously data-intensive, the task of sampling
observations from the environment is usually split across multiple agents. However …

The blessing of heterogeneity in federated Q-learning: Linear speedup and beyond

J Woo, G Joshi, Y Chi - International Conference on …, 2023 - proceedings.mlr.press
In this paper, we consider federated Q-learning, which aims to learn an optimal Q-function
by periodically aggregating local Q-estimates trained on local data alone. Focusing on …
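
The snippet describes the basic periodic-aggregation pattern. A minimal illustrative sketch follows (not the paper's exact algorithm; the environment interface, exploration rate, step size, and sync period are all assumptions made here):

    import numpy as np

    def federated_q_learning(envs, n_states, n_actions, gamma=0.99,
                             alpha=0.1, eps=0.1, sync_period=50, n_syncs=200):
        # Each agent keeps a local tabular Q-estimate trained on its own data.
        M = len(envs)
        Q = [np.zeros((n_states, n_actions)) for _ in range(M)]
        states = [env.reset() for env in envs]  # assumed API: reset(), step(a) -> (s', r, done)
        for t in range(n_syncs * sync_period):
            for m, env in enumerate(envs):
                s = states[m]
                a = (np.random.randint(n_actions) if np.random.rand() < eps
                     else int(np.argmax(Q[m][s])))
                s2, r, done = env.step(a)
                Q[m][s, a] += alpha * (r + gamma * Q[m][s2].max() - Q[m][s, a])
                states[m] = env.reset() if done else s2
            if (t + 1) % sync_period == 0:
                # Periodic aggregation: the server averages the local
                # Q-tables and broadcasts the average back to all agents.
                avg = sum(Q) / M
                Q = [avg.copy() for _ in range(M)]
        return sum(Q) / M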

The sample-communication complexity trade-off in federated Q-learning

S Salgia, Y Chi - Advances in Neural Information …, 2025 - proceedings.neurips.cc
We consider the problem of Federated Q-learning, where $M$ agents aim to collaboratively
learn the optimal Q-function of an unknown infinite horizon Markov Decision Process with …
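
For reference, the optimal Q-function the $M$ agents target is the standard fixed point of the Bellman optimality operator for a discounted infinite-horizon MDP (textbook definition, not specific to this paper):

    Q^*(s,a) = r(s,a) + \gamma \, \mathbb{E}_{s' \sim P(\cdot \mid s,a)} \big[ \max_{a'} Q^*(s',a') \big]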

Distributed momentum-based Frank-Wolfe algorithm for stochastic optimization

J Hou, X Zeng, G Wang, J Sun… - IEEE/CAA Journal of …, 2022 - ieeexplore.ieee.org
This paper considers distributed stochastic optimization, in which a number of agents
cooperate to optimize a global objective function through local computations and information …
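
The title's method combines a Frank-Wolfe (projection-free) update with a momentum-averaged stochastic gradient. Below is a single-node sketch of that combination over an l1-ball constraint; the distributed variant would additionally mix iterates with neighbors, and the step-size schedules and gradient oracle are illustrative assumptions:

    import numpy as np

    def momentum_frank_wolfe(stoch_grad, x0, radius=1.0, T=1000):
        x = x0.astype(float).copy()
        d = np.zeros_like(x)
        for t in range(1, T + 1):
            rho, eta = 2.0 / (t + 1), 2.0 / (t + 2)   # illustrative schedules
            d = (1 - rho) * d + rho * stoch_grad(x)   # momentum gradient estimate
            i = int(np.argmax(np.abs(d)))             # LMO for the l1 ball:
            v = np.zeros_like(x)                      # argmin_{||v||_1 <= r} <d, v>
            v[i] = -radius * np.sign(d[i])
            x = x + eta * (v - x)                     # projection-free step
        return x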

Sample and communication-efficient decentralized actor-critic algorithms with finite-time analysis

Z Chen, Y Zhou, RR Chen… - … Conference on Machine …, 2022 - proceedings.mlr.press
Actor-critic (AC) algorithms have been widely used in decentralized multi-agent systems to
learn the optimal joint control policy. However, existing decentralized AC algorithms either …

Federated Q-learning: Linear regret speedup with low communication cost

Z Zheng, F Gao, L Xue, J Yang - arXiv preprint arXiv:2312.15023, 2023 - arxiv.org
In this paper, we consider federated reinforcement learning for tabular episodic Markov
Decision Processes (MDPs) where, under the coordination of a central server, multiple …

Taming communication and sample complexities in decentralized policy evaluation for cooperative multi-agent reinforcement learning

X Zhang, Z Liu, J Liu, Z Zhu… - Advances in Neural …, 2021 - proceedings.neurips.cc
Cooperative multi-agent reinforcement learning (MARL) has received increasing attention in
recent years and has found many scientific and engineering applications. However, a key …

Federated Q-learning with reference-advantage decomposition: almost optimal regret and logarithmic communication cost

Z Zheng, H Zhang, L Xue - arXiv preprint arXiv:2405.18795, 2024 - arxiv.org
In this paper, we consider model-free federated reinforcement learning for tabular episodic
Markov decision processes. Under the coordination of a central server, multiple agents …

Central limit theorem for two-timescale stochastic approximation with Markovian noise: Theory and applications

J Hu, V Doshi - International Conference on Artificial …, 2024 - proceedings.mlr.press
Two-timescale stochastic approximation (TTSA) is among the most general frameworks for
iterative stochastic algorithms. This includes well-known stochastic optimization methods …
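
In its generic form, a TTSA scheme couples a fast and a slow iterate driven by the same (possibly Markovian) noise sequence $\xi_k$ (standard template; $f$ and $g$ are placeholder drift functions introduced here for illustration):

    x_{k+1} = x_k + \alpha_k f(x_k, y_k, \xi_k), \quad y_{k+1} = y_k + \beta_k g(x_k, y_k, \xi_k), \quad \beta_k / \alpha_k \to 0,

so the fast iterate $x_k$ tracks an equilibrium of $f$ for the slowly varying $y_k$; actor-critic is the canonical RL instance (critic on the fast timescale, actor on the slow one).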

Distributed TD(0) with almost no communication

R Liu, A Olshevsky - IEEE Control Systems Letters, 2023 - ieeexplore.ieee.org
We provide a new non-asymptotic analysis of distributed temporal difference learning with
linear function approximation. Our approach relies on “one-shot averaging,” where N agents …
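
"One-shot averaging" means the agents communicate exactly once, after all local computation is done. A minimal sketch under assumed interfaces (a feature map phi and per-agent transition samplers yielding (s, r, s') tuples; the step size and horizon are illustrative):

    import numpy as np

    def one_shot_averaged_td0(samplers, phi, dim, gamma=0.95, alpha=0.01, T=10_000):
        thetas = []
        for sample in samplers:                   # each agent runs TD(0) independently
            theta = np.zeros(dim)
            for _ in range(T):
                s, r, s2 = sample()               # one Markovian transition
                td_err = r + gamma * phi(s2) @ theta - phi(s) @ theta
                theta += alpha * td_err * phi(s)  # linear-FA TD(0) update
            thetas.append(theta)
        return np.mean(thetas, axis=0)            # the single round of communication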