Google Scholar

A survey of graph neural networks for social recommender systems

K Sharma, YC Lee, S Nambi, A Salian, S Shah… - ACM Computing …, 2024 - dl.acm.org

Social recommender systems (SocialRS) simultaneously leverage the user-to-item
interactions as well as the user-to-user social relations for the task of generating item …

Cited by 72

Application of machine learning in wireless networks: Key techniques and open issues

Y Sun, M Peng, Y Zhou, Y Huang… - … Surveys & Tutorials, 2019 - ieeexplore.ieee.org

As a key technique for enabling artificial intelligence, machine learning (ML) is capable of
solving complex problems without explicit programming. Motivated by its successful …

Cited by 699

Automatic prompt optimization with "gradient descent" and beam search

R Pryzant, D Iter, J Li, YT Lee, C Zhu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have shown impressive performance as general purpose
agents, but their abilities remain highly dependent on prompts which are hand written with …

Cited by 274

User-friendly introduction to PAC-Bayes bounds

P Alquier - Foundations and Trends® in Machine Learning, 2024 - nowpublishers.com

Aggregated predictors are obtained by making a set of basic predictors vote according to
some weights, that is, to some probability distribution. Randomized predictors are obtained …

Cited by 217

Provably efficient reinforcement learning with linear function approximation

C Jin, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press

Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where function approximation must be deployed …

Cited by 779

[BOOK][B] Control systems and reinforcement learning

S Meyn - 2022 - books.google.com

A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …

Cited by 158

Neural Thompson sampling

W Zhang, D Zhou, L Li, Q Gu - arXiv preprint arXiv:2010.00827, 2020 - arxiv.org

Thompson Sampling (TS) is one of the most effective algorithms for solving contextual
multi-armed bandit problems. In this paper, we propose a new algorithm, called Neural Thompson …

Cited by 282

Is Q-learning provably efficient?

C Jin, Z Allen-Zhu, S Bubeck… - Advances in neural …, 2018 - proceedings.neurips.cc

Model-free reinforcement learning (RL) algorithms directly parameterize and
update value functions or policies, bypassing the modeling of the environment. They are …

Cited by 1022

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3313

Multi-armed bandit-based client scheduling for federated learning

W Xia, TQS Quek, K Guo, W Wen… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

By exploiting the computing power and local data of distributed clients, federated learning
(FL) features ubiquitous properties such as reduction of communication overhead and …

Cited by 284
