Google Scholar

In vitro neurons learn and exhibit sentience when embodied in a simulated game-world

BJ Kagan, AC Kitchen, NT Tran, F Habibollahi… - Neuron, 2022 - cell.com

Integrating neurons into digital systems may enable performance infeasible with silicon
alone. Here, we develop DishBrain, a system that harnesses the inherent adaptive …

Cited by 259; 21 versions


[PDF] ens.fr

[BOOK][B] Learning theory from first principles

F Bach - 2024 - books.google.com

A comprehensive and cutting-edge introduction to the foundations and modern applications
of learning theory. Research has exploded in the field of machine learning resulting in …

Cited by 131; 7 versions


[PDF] neurips.cc

Model selection in contextual stochastic bandit problems

A Pacchiano, M Phan… - Advances in …, 2020 - proceedings.neurips.cc

We study bandit model selection in stochastic environments. Our approach relies on a
master algorithm that selects between candidate base algorithms. We develop a master …

Cited by 115; 7 versions


[PDF] mlr.press

Multi-armed bandit experimental design: Online decision-making and adaptive inference

D Simchi-Levi, C Wang - International Conference on …, 2023 - proceedings.mlr.press

Multi-armed bandit has been well-known for its efficiency in online decision-making in terms
of minimizing the loss of the participants' welfare during experiments (i.e., the regret). In …

Cited by 38; 3 versions


[PDF] mlr.press

One practical algorithm for both stochastic and adversarial bandits

Y Seldin, A Slivkins - International Conference on Machine …, 2014 - proceedings.mlr.press

We present an algorithm for multiarmed bandits that achieves almost optimal performance in
both stochastic and adversarial regimes without prior knowledge about the nature of the …

Cited by 196; 11 versions


[PDF] arxiv.org

Efficient online data mixing for language model pre-training

A Albalak, L Pan, C Raffel, WY Wang - arXiv preprint arXiv:2312.02406, 2023 - arxiv.org

The data used to pretrain large language models has a decisive impact on a model's
downstream performance, which has led to a large body of work on data selection methods …

Cited by 24; 5 versions


[PDF] udel.edu

Resource optimization of MAB-based reputation management for data trading in vehicular edge computing

H Xiao, L Cai, J Feng, Q Pei… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Vehicles are hesitant to upload data to edge servers in vehicle edge computing (VEC) as
many vehicle data collected and perceived by various on-board sensors contain sensitive …

Cited by 19; 8 versions


[PDF] neurips.cc

Improving few-shot generalization by exploring and exploiting auxiliary data

A Albalak, CA Raffel, WY Wang - Advances in Neural …, 2023 - proceedings.neurips.cc

Few-shot learning is valuable in many real-world applications, but learning a generalizable
model without overfitting to the few labeled datapoints is challenging. In this work, we focus …

Cited by 11; 8 versions


[PDF] arxiv.org

Distributed online learning for coexistence in cognitive radar networks

WW Howard, AF Martone… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

This work addresses the coexistence problem for radar networks. Specifically, we model a
network of cooperative, independent, and non-communicating radar nodes which must …

Cited by 20; 5 versions


[PDF] mlr.press

Beyond variance reduction: Understanding the true impact of baselines on policy optimization

W Chung, V Thomas, MC Machado… - … on Machine Learning, 2021 - proceedings.mlr.press

Bandit and reinforcement learning (RL) problems can often be framed as optimization
problems where the goal is to maximize average performance while having access only to …

Cited by 31; 4 versions


Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments

In vitro neurons learn and exhibit sentience when embodied in a simulated game-world
