Reinforcement learning algorithms: A brief survey

AK Shakya, G Pillai, S Chakrabarty - Expert Systems with Applications, 2023 - Elsevier
Reinforcement Learning (RL) is a machine learning (ML) technique to learn sequential
decision-making in complex problems. RL is inspired by trial-and-error based human/animal …

Toward autonomous multi-UAV wireless network: A survey of reinforcement learning-based approaches

Y Bai, H Zhao, X Zhang, Z Chang… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
Unmanned aerial vehicle (UAV)-based wireless networks have received increasing
research interest in recent years and are gradually being utilized in various aspects of our …

Rlaif: Scaling reinforcement learning from human feedback with ai feedback

H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard… - 2023 - openreview.net
Reinforcement learning from human feedback (RLHF) is an effective technique for aligning
large language models (LLMs) to human preferences, but gathering high-quality human …

MiniLLM: Knowledge distillation of large language models

Y Gu, L Dong, F Wei, M Huang - arxiv preprint arxiv:2306.08543, 2023 - arxiv.org
Knowledge Distillation (KD) is a promising technique for reducing the high computational
demand of large language models (LLMs). However, previous KD methods are primarily …

Training diffusion models with reinforcement learning

K Black, M Janner, Y Du, I Kostrikov… - arxiv preprint arxiv …, 2023 - arxiv.org
Diffusion models are a class of flexible generative models trained with an approximation to
the log-likelihood objective. However, most use cases of diffusion models are not concerned …

Coderl: Mastering code generation through pretrained models and deep reinforcement learning

H Le, Y Wang, AD Gotmare… - Advances in Neural …, 2022 - proceedings.neurips.cc
Program synthesis or code generation aims to generate a program that satisfies a problem
specification. Recent approaches using large-scale pretrained language models (LMs) have …

A review on reinforcement learning algorithms and applications in supply chain management

B Rolf, I Jackson, M Müller, S Lang… - … Journal of Production …, 2023 - Taylor & Francis
Decision-making in supply chains is challenged by high complexity, a combination of
continuous and discrete processes, integrated and interdependent operations, dynamics …

Deep reinforcement learning in smart manufacturing: A review and prospects

C Li, P Zheng, Y Yin, B Wang, L Wang - CIRP Journal of Manufacturing …, 2023 - Elsevier
To facilitate the personalized smart manufacturing paradigm with cognitive automation
capabilities, Deep Reinforcement Learning (DRL) has attracted ever-increasing attention by …

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arxiv preprint arxiv …, 2023 - arxiv.org
Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Attention mechanisms in computer vision: A survey

MH Guo, TX Xu, JJ Liu, ZN Liu, PT Jiang, TJ Mu… - Computational visual …, 2022 - Springer
Humans can naturally and effectively find salient regions in complex scenes. Motivated by
this observation, attention mechanisms were introduced into computer vision with the aim of …