Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
An overview of multi-agent reinforcement learning from game theoretical perspective
Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGo series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …
A modern introduction to online learning
F Orabona - arXiv preprint arXiv:1912.13213, 2019 - arxiv.org
In this monograph, I introduce the basic concepts of Online Learning through a modern view
of Online Convex Optimization. Here, online learning refers to the framework of regret …
[BOOK][B] Partially observed Markov decision processes
V Krishnamurthy - 2016 - books.google.com
Covering formulation, algorithms, and structural results, and linking theory to real-world
applications in controlled sensing (including social learning, adaptive radars and sequential …
Potential games
Games and Economic Behavior 14, 124–143 (1996), Article No. 0044. Potential Games.
Dov Monderer, Faculty of Industrial Engineering and …
[BOOK][B] Prediction, learning, and games
N Cesa-Bianchi, G Lugosi - 2006 - books.google.com
This important text and reference for researchers and students in machine learning, game
theory, statistics and information theory offers a comprehensive treatment of the problem of …
The nonstochastic multiarmed bandit problem
In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot
machines to play in a sequence of trials so as to maximize his reward. This classical …
[PDF] Online convex programming and generalized infinitesimal gradient ascent
M Zinkevich - Proceedings of the 20th international conference on …, 2003 - cdn.aaai.org
Convex programming involves a convex set F⊆ Rn and a convex cost function c: F→ R. The
goal of convex programming is to find a point in F which minimizes c. In online convex …
Learning in repeated auctions with budgets: Regret minimization and equilibrium
In online advertising markets, advertisers often purchase ad placements through bidding in
repeated auctions based on realized viewer information. We study how budget-constrained …
[BOOK][B] Robustness
LP Hansen, TJ Sargent - 2008 - degruyter.com
The standard theory of decision making under uncertainty advises the decision maker to
form a statistical model linking outcomes to decisions and then to choose the optimal …