- Academic Search

Maven: Multi-agent variational exploration

A Mahajan, T Rashid, M Samvelyan… - Advances in neural …, 2019 - proceedings.neurips.cc

Centralised training with decentralised execution is an important setting for cooperative
deep multi-agent reinforcement learning due to communication constraints during execution …

Cited by: 447, all 11 versions

[PDF] mlr.press

Constrained variational policy optimization for safe reinforcement learning

Z Liu, Z Cen, V Isenbaev, W Liu, S Wu… - International …, 2022 - proceedings.mlr.press

Safe reinforcement learning (RL) aims to learn policies that satisfy certain constraints before
deploying them to safety-critical applications. Previous primal-dual style approaches suffer …

Cited by: 90, all 6 versions

[PDF] frontiersin.org

Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities

A Gangwal, A Ansari, I Ahmad, AK Azad… - Frontiers in …, 2024 - frontiersin.org

There are two main ways to discover or design small drug molecules. The first involves
fine-tuning existing molecules or commercially successful drugs through quantitative structure …

Cited by: 37, all 7 versions

[PDF] neurips.cc

Deep active inference agents using Monte-Carlo methods

Z Fountas, N Sajid, P Mediano… - Advances in neural …, 2020 - proceedings.neurips.cc

Active inference is a Bayesian framework for understanding biological intelligence. The
underlying theory brings together perception and action under one single imperative …

Cited by: 126, all 10 versions

[PDF] arxiv.org

Deep active inference as variational policy gradients

B Millidge - Journal of Mathematical Psychology, 2020 - Elsevier

Active Inference is a theory arising from theoretical neuroscience which casts action and
planning as Bayesian inference problems to be solved by minimizing a single quantity—the …

Cited by: 126, all 6 versions

[PDF] neurips.cc

Leverage the average: an analysis of KL regularization in reinforcement learning

N Vieillard, T Kozuno, B Scherrer… - Advances in …, 2020 - proceedings.neurips.cc

Recent Reinforcement Learning (RL) algorithms making use of Kullback-Leibler
(KL) regularization as a core component have shown outstanding performance. Yet, only …

Cited by: 85, all 8 versions

[PDF] neurips.cc

Posterior sampling with delayed feedback for reinforcement learning with linear function approximation

NL Kuang, M Yin, M Wang… - Advances in Neural …, 2023 - proceedings.neurips.cc

Recent studies in reinforcement learning (RL) have made significant progress by leveraging
function approximation to alleviate the sample complexity hurdle for better performance …

Cited by: 8, all 5 versions

Adversarial Binaries: AI-guided Instrumentation Methods for Malware Detection Evasion

L Koch, E Begoli - ACM Computing Surveys, 2025 - dl.acm.org

Adversarial binaries are executable files that have been altered without loss of function by
an AI agent in order to deceive malware detection systems. Progress in this emergent vein of …

[PDF] arxiv.org

Iterated reasoning with mutual information in cooperative and Byzantine decentralized teaming

S Konan, E Seraj, M Gombolay - arXiv preprint arXiv:2201.08484, 2022 - arxiv.org

Information sharing is key in building team cognition and enables coordination and
cooperation. High-performing human teams also benefit from acting strategically with …

Cited by: 40, all 4 versions

[PDF] neurips.cc

Coherent soft imitation learning

J Watson, S Huang, N Heess - Advances in Neural …, 2024 - proceedings.neurips.cc

Imitation learning methods seek to learn from an expert either through behavioral cloning
(BC) for the policy or inverse reinforcement learning (IRL) for the reward. Such methods …

Cited by: 6, all 7 versions
