- Academic Search

Policy gradient methods for reinforcement learning with function approximation

Reinforcement learning algorithms: A brief survey

AK Shakya, G Pillai, S Chakrabarty - Expert Systems with Applications, 2023 - Elsevier

Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …

Cited by 221 - All 2 versions

[PDF] ieee.org

Toward autonomous multi-UAV wireless network: A survey of reinforcement learning-based approaches

Y Bai, H Zhao, X Zhang, Z Chang… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org

Unmanned aerial vehicle (UAV)-based wireless networks have received increasing
research interest in recent years and are gradually being utilized in various aspects of our …

Cited by 104 - All 4 versions

[PDF] openreview.net

RLAIF: Scaling reinforcement learning from human feedback with AI feedback

H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard… - 2023 - openreview.net

Reinforcement learning from human feedback (RLHF) is an effective technique for aligning
large language models (LLMs) to human preferences, but gathering high-quality human …

Cited by 477 - All 4 versions

[PDF] arxiv.org

MiniLLM: Knowledge distillation of large language models

Y Gu, L Dong, F Wei, M Huang - arXiv preprint arXiv:2306.08543, 2023 - arxiv.org

Knowledge Distillation (KD) is a promising technique for reducing the high computational
demand of large language models (LLMs). However, previous KD methods are primarily …

Cited by 298 - All 4 versions

[PDF] arxiv.org

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arXiv preprint arXiv …, 2023 - arxiv.org

Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

Cited by 251 - All 6 versions

[PDF] neurips.cc

CodeRL: Mastering code generation through pretrained models and deep reinforcement learning

H Le, Y Wang, AD Gotmare… - Advances in Neural …, 2022 - proceedings.neurips.cc

Program synthesis or code generation aims to generate a program that satisfies a problem
specification. Recent approaches using large-scale pretrained language models (LMs) have …

Cited by 337 - All 7 versions

[PDF] tandfonline.com

A review on reinforcement learning algorithms and applications in supply chain management

B Rolf, I Jackson, M Müller, S Lang… - … Journal of Production …, 2023 - Taylor & Francis

Decision-making in supply chains is challenged by high complexity, a combination of
continuous and discrete processes, integrated and interdependent operations, dynamics …

Cited by 189 - All 7 versions

[PDF] researchgate.net

Deep reinforcement learning in smart manufacturing: A review and prospects

C Li, P Zheng, Y Yin, B Wang, L Wang - CIRP Journal of Manufacturing …, 2023 - Elsevier

To facilitate the personalized smart manufacturing paradigm with cognitive automation
capabilities, Deep Reinforcement Learning (DRL) has attracted ever-increasing attention by …

Cited by 196 - All 6 versions

[PDF] arxiv.org

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Cited by 134 - All 6 versions

[PDF] springer.com

Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer

Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …

Cited by 1935 - All 10 versions
