An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org
Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Decision-focused learning: Foundations, state of the art, benchmark and future opportunities

J Mandi, J Kotary, S Berden, M Mulamba… - Journal of Artificial …, 2024 - jair.org
Decision-focused learning (DFL) is an emerging paradigm that integrates machine learning
(ML) and constrained optimization to enhance decision quality by training ML models in an …

Principled reinforcement learning with human feedback from pairwise or k-wise comparisons

B Zhu, M Jordan, J Jiao - International Conference on …, 2023 - proceedings.mlr.press
We provide a theoretical framework for Reinforcement Learning with Human Feedback
(RLHF). We show that when the underlying true reward is linear, under both Bradley-Terry …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

Efficiently teaching an effective dense retriever with balanced topic aware sampling

S Hofstätter, SC Lin, JH Yang, J Lin… - Proceedings of the 44th …, 2021 - dl.acm.org
A vital step towards the widespread adoption of neural retrieval models is their resource
efficiency throughout the training, indexing and query workflows. The neural IR community …

RocketQAv2: A joint training method for dense passage retrieval and passage re-ranking

R Ren, Y Qu, J Liu, WX Zhao, Q She, H Wu… - arXiv preprint arXiv …, 2021 - arxiv.org
In various natural language processing tasks, passage retrieval and passage re-ranking are
two key procedures in finding and ranking relevant information. Since both the two …

[BOOK][B] Pretrained transformers for text ranking: BERT and beyond

J Lin, R Nogueira, A Yates - 2022 - books.google.com
The goal of text ranking is to generate an ordered list of texts retrieved from a corpus in
response to a query. Although the most common formulation of text ranking is search …

Identification of potent antimicrobial peptides via a machine-learning pipeline that mines the entire space of peptide sequences

J Huang, Y Xu, Y Xue, Y Huang, X Li, X Chen… - Nature Biomedical …, 2023 - nature.com
Systematically identifying functional peptides is difficult owing to the vast combinatorial
space of peptide sequences. Here we report a machine-learning pipeline that mines the …

DCN V2: Improved deep & cross network and practical lessons for web-scale learning to rank systems

R Wang, R Shivanna, D Cheng, S Jain, D Lin… - Proceedings of the Web …, 2021 - dl.acm.org
Learning effective feature crosses is the key behind building recommender systems.
However, the sparse and large feature space requires exhaustive search to identify effective …

Leveraging large language models in conversational recommender systems

L Friedman, S Ahuja, D Allen, Z Tan… - arXiv preprint arXiv …, 2023 - arxiv.org
A Conversational Recommender System (CRS) offers increased transparency and control to
users by enabling them to engage with the system through a real-time multi-turn dialogue …