Algorithmic and human collusion

T Werner - Available at SSRN 3960738, 2024 - papers.ssrn.com
I study self-learning pricing algorithms and show that they are collusive in market
simulations. To derive a counterfactual that resembles traditional tacit collusion, I conduct …

Aligning diffusion behaviors with q-functions for efficient continuous control

H Chen, K Zheng, H Su, J Zhu - arXiv preprint arXiv:2407.09024, 2024 - arxiv.org
Drawing upon recent advances in language model alignment, we formulate offline
Reinforcement Learning as a two-stage optimization problem: First pretraining expressive …

Hybrid reinforcement learning from offline observation alone

Y Song, JA Bagnell, A Singh - arXiv preprint arXiv:2406.07253, 2024 - arxiv.org
We consider the hybrid reinforcement learning setting where the agent has access to both
offline data and online interactive access. While Reinforcement Learning (RL) research …

Perspectives for Direct Interpretability in Multi-Agent Deep Reinforcement Learning

Y Poupart, A Beynier, N Maudet - arXiv preprint arXiv:2502.00726, 2025 - arxiv.org
Multi-Agent Deep Reinforcement Learning (MADRL) was proven efficient in solving complex
problems in robotics or games, yet most of the trained models are hard to interpret. While …

Scrutinize What We Ignore: Reining In Task Representation Shift Of Context-Based Offline Meta Reinforcement Learning

H Zhang, B Zheng, T Ji, J Liu, A Guo, J Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
Offline meta reinforcement learning (OMRL) has emerged as a promising approach for
interaction avoidance and strong generalization performance by leveraging pre-collected …

Efficient policy evaluation with offline data informed behavior policy design

S Liu, S Zhang - arXiv preprint arXiv:2301.13734, 2023 - arxiv.org
Most reinforcement learning practitioners evaluate their policies with online Monte Carlo
estimators for either hyperparameter tuning or testing different algorithmic design choices …

Offline Fictitious Self-Play for Competitive Games

J Chen, W Xie, W Zhang, Y Wen - arXiv preprint arXiv:2403.00841, 2024 - arxiv.org
Offline Reinforcement Learning (RL) has received significant interest due to its ability to
improve policies in previously collected datasets without online interactions. Despite its …

Test-Fleet Optimization Using Machine Learning

A Datta, BV Yaganti, A Dove, A Peltz… - 2024 IEEE European …, 2024 - ieeexplore.ieee.org
We present a solution to the complex problem of scheduling test operations in a validation
lab or production facility. Our goal is to maximize the utilization of a fleet of test stations and …

Quantum Intelligence: Responsible Human-AI Entities.

M Swan, RP dos Santos - AAAI Spring Symposium: SRAI, 2023 - ceur-ws.org
The increasing ability to harness quantum, classical, and relativistic scales, together with
fast-paced change in generative AI and quantum computing, suggests the possibility of …

Comparing Transfer Learning and Rollout for Policy Adaptation in a Changing Network Environment

FS Samani, H Larsson, S Damberg… - NOMS 2024-2024 …, 2024 - ieeexplore.ieee.org
Dynamic resource allocation for network services is pivotal for achieving end-to-end
management objectives. Previous research has demonstrated that Reinforcement Learning …