Google Scholar

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org

Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cited by 351 - All 2 versions

[HTML] acm.org

Reinforcement learning: A tutorial survey and recent advances

A Gosavi - INFORMS Journal on Computing, 2009 - pubsonline.informs.org

In the last few years, reinforcement learning (RL), also called adaptive (or approximate)
dynamic programming, has emerged as a powerful tool for solving complex sequential …

Cited by 452 - All 14 versions

[PDF] mlr.press

Adversarially trained actor critic for offline reinforcement learning

CA Cheng, T Xie, N Jiang… - … Conference on Machine …, 2022 - proceedings.mlr.press

Abstract We propose Adversarially Trained Actor Critic (ATAC), a new model-free algorithm
for offline reinforcement learning (RL) under insufficient data coverage, based on the …

Cited by 150 - All 8 versions

[PDF] arxiv.org

A two-timescale stochastic algorithm framework for bilevel optimization: Complexity analysis and application to actor-critic

M Hong, HT Wai, Z Wang, Z Yang - SIAM Journal on Optimization, 2023 - SIAM

This paper analyzes a two-timescale stochastic algorithm framework for bilevel optimization.
Bilevel optimization is a class of problems which exhibits a two-level structure, and its goal is …

Cited by 323 - All 6 versions

A two-level charging scheduling method for public electric vehicle charging stations considering heterogeneous demand and nonlinear charging profile

Z Zhao, CKM Lee, J Ren - Applied Energy, 2024 - Elsevier

This paper investigates the electric vehicle (EV) charging scheduling problem for public EV
charging stations (EVCSs) that can accommodate heterogeneous charging demands …

Cited by 38 - All 5 versions

[PDF] neurips.cc

GANs trained by a two time-scale update rule converge to a local Nash equilibrium

M Heusel, H Ramsauer, T Unterthiner… - Advances in neural …, 2017 - proceedings.neurips.cc

Abstract Generative Adversarial Networks (GANs) excel at creating realistic images with
complex models for which maximum likelihood is infeasible. However, the convergence of …

[PDF] mlr.press

SBEED: Convergent reinforcement learning with nonlinear function approximation

B Dai, A Shaw, L Li, L Xiao, N He… - International …, 2018 - proceedings.mlr.press

When function approximation is used, solving the Bellman optimality equation with stability
guarantees has remained a major open problem in reinforcement learning for decades. The …

Cited by 327 - All 6 versions

[PDF] neurips.cc

RUDDER: Return decomposition for delayed rewards

JA Arjona-Medina, M Gillhofer… - Advances in …, 2019 - proceedings.neurips.cc

We propose RUDDER, a novel reinforcement learning approach for delayed rewards in
finite Markov decision processes (MDPs). In MDPs the Q-values are equal to the expected …

Cited by 273 - All 9 versions

[PDF] arxiv.org

FedGAN: Federated generative adversarial networks for distributed data

M Rasouli, T Sun, R Rajagopal - arXiv preprint arXiv:2006.07228, 2020 - arxiv.org

We propose Federated Generative Adversarial Network (FedGAN) for training a GAN across
distributed sources of non-independent-and-identically-distributed data sources subject to …

Cited by 173 - All 2 versions

[PDF] arxiv.org

Parametrized deep Q-networks learning: Reinforcement learning with discrete-continuous hybrid action space

J Xiong, Q Wang, Z Yang, P Sun, L Han… - arXiv preprint arXiv …, 2018 - arxiv.org

Most existing deep reinforcement learning (DRL) frameworks consider either discrete action
space or continuous action space solely. Motivated by applications in computer games, we …

Cited by 240 - All 2 versions

Stochastic approximation with two time scales

An overview of multi-agent reinforcement learning from game theoretical perspective
