Survey on applications of multi-armed and contextual bandits
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in
various applications, from recommender systems and information retrieval to healthcare and …
A survey on practical applications of multi-armed and contextual bandits
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in
various applications, from recommender systems and information retrieval to healthcare and …
Multi-armed bandits in recommendation systems: A survey of the state-of-the-art and future directions
Abstract Recommender Systems (RSs) have assumed a crucial role in several digital
companies by directly affecting their key performance indicators. Nowadays, in this era of big …
Teaching AI agents ethical values using reinforcement learning and policy orchestration
Autonomous cyber-physical agents play an increasingly large role in our lives. To ensure
that they behave in ways aligned with the values of society, we must develop techniques that …
Incorporating behavioral constraints in online AI systems
AI systems that learn through reward feedback about the actions they take are increasingly
deployed in domains that have significant impact on our daily life. However, in many cases …
Interactive reinforcement learning for feature selection with decision tree in the loop
We study the problem of balancing effectiveness and efficiency in automated feature
selection. Feature selection is to find an optimal feature subset from large feature space …
Optimal exploitation of clustering and history information in multi-armed bandit
We consider the stochastic multi-armed bandit problem and the contextual bandit problem
with historical observations and pre-clustered arms. The historical observations can contain …
Adversarial Machine Learning: Attack Surfaces, Defence Mechanisms, Learning Theories in Artificial Intelligence
A significant robustness gap exists between machine intelligence and human perception
despite recent advances in deep learning. Deep learning is not provably secure. A critical …
Unified models of human behavioral agents in bandits, contextual bandits and RL
Artificial behavioral agents are often evaluated based on their consistent behaviors and
performance to take sequential actions in an environment to maximize some notion of …
Using multi-armed bandits to learn ethical priorities for online AI systems
AI systems that learn through reward feedback about the actions they take are deployed in
domains that have significant impact on our daily life. However, in many cases the online …