Survey on applications of multi-armed and contextual bandits

D Bouneffouf, I Rish, C Aggarwal - 2020 IEEE Congress on …, 2020 - ieeexplore.ieee.org
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in
various applications, from recommender systems and information retrieval to healthcare and …

A survey on practical applications of multi-armed and contextual bandits

D Bouneffouf, I Rish - arXiv preprint arXiv:1904.10040, 2019 - arxiv.org
In recent years, the multi-armed bandit (MAB) framework has attracted a lot of attention in
various applications, from recommender systems and information retrieval to healthcare and …

Multi-armed bandits in recommendation systems: A survey of the state-of-the-art and future directions

N Silva, H Werneck, T Silva, ACM Pereira… - Expert Systems with …, 2022 - Elsevier
Abstract Recommender Systems (RSs) have assumed a crucial role in several digital
companies by directly affecting their key performance indicators. Nowadays, in this era of big …

Teaching AI agents ethical values using reinforcement learning and policy orchestration

R Noothigattu, D Bouneffouf, N Mattei… - IBM Journal of …, 2019 - ieeexplore.ieee.org
Autonomous cyber-physical agents play an increasingly large role in our lives. To ensure
that they behave in ways aligned with the values of society, we must develop techniques that …

Incorporating behavioral constraints in online AI systems

A Balakrishnan, D Bouneffouf, N Mattei… - Proceedings of the AAAI …, 2019 - ojs.aaai.org
AI systems that learn through reward feedback about the actions they take are increasingly
deployed in domains that have significant impact on our daily life. However, in many cases …

Interactive reinforcement learning for feature selection with decision tree in the loop

W Fan, K Liu, H Liu, Y Ge, H Xiong… - IEEE Transactions on …, 2021 - ieeexplore.ieee.org
We study the problem of balancing effectiveness and efficiency in automated feature
selection. Feature selection is to find an optimal feature subset from large feature space …

Optimal exploitation of clustering and history information in multi-armed bandit

D Bouneffouf, S Parthasarathy, H Samulowitz… - arXiv preprint arXiv …, 2019 - arxiv.org
We consider the stochastic multi-armed bandit problem and the contextual bandit problem
with historical observations and pre-clustered arms. The historical observations can contain …

[BOOK] Adversarial Machine Learning: Attack Surfaces, Defence Mechanisms, Learning Theories in Artificial Intelligence

AS Chivukula, X Yang, B Liu, W Liu, W Zhou - 2023 - Springer
A significant robustness gap exists between machine intelligence and human perception
despite recent advances in deep learning. Deep learning is not provably secure. A critical …

Unified models of human behavioral agents in bandits, contextual bandits and RL

B Lin, G Cecchi, D Bouneffouf, J Reinen… - arXiv preprint arXiv …, 2020 - researchgate.net
Artificial behavioral agents are often evaluated based on their consistent behaviors and
performance to take sequential actions in an environment to maximize some notion of …

Using multi-armed bandits to learn ethical priorities for online AI systems

A Balakrishnan, D Bouneffouf… - IBM Journal of …, 2019 - ieeexplore.ieee.org
AI systems that learn through reward feedback about the actions they take are deployed in
domains that have significant impact on our daily life. However, in many cases the online …