Pyserini: A Python toolkit for reproducible information retrieval research with sparse and dense representations

J Lin, X Ma, SC Lin, JH Yang, R Pradeep… - Proceedings of the 44th …, 2021 - dl.acm.org
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and
dense representations. It aims to provide effective, reproducible, and easy-to-use first-stage …

Anserini: Enabling the use of Lucene for information retrieval research

P Yang, H Fang, J Lin - Proceedings of the 40th international ACM SIGIR …, 2017 - dl.acm.org
Software toolkits play an essential role in information retrieval research. Most open-source
toolkits developed by academics are designed to facilitate the evaluation of retrieval models …

Anserini: Reproducible ranking baselines using Lucene

P Yang, H Fang, J Lin - Journal of Data and Information Quality (JDIQ), 2018 - dl.acm.org
This work tackles the perennial problem of reproducible baselines in information retrieval
research, focusing on bag-of-words ranking models. Although academic information …

Improving accountability in recommender systems research through reproducibility

A Bellogín, A Said - User Modeling and User-Adapted Interaction, 2021 - Springer
Reproducibility is a key requirement for scientific progress. It allows the reproduction of the
works of others, and, as a consequence, to fully trust the reported claims and results. In this …

Increasing Reproducibility in IR: Findings from the Dagstuhl Seminar on "Reproducibility of Data-Oriented Experiments in e-Science"

N Ferro, N Fuhr, K Järvelin, N Kando, M Lippold… - ACM SIGIR Forum, 2016 - dl.acm.org
The Dagstuhl Seminar on "Reproducibility of Data-Oriented Experiments in e-Science", held
on 24-29 January 2016, focused on the core issues and approaches to reproducibility of …

The neural hype and comparisons against weak baselines

J Lin - ACM SIGIR Forum, 2019 - dl.acm.org
Recently, the machine learning community paused in a moment of self-reflection. In a
widely discussed paper at ICLR 2018, Sculley et al. [13] wrote: "We observe that the rate of …

Pyserini: An easy-to-use Python toolkit to support replicable IR research with sparse and dense representations

J Lin, X Ma, SC Lin, JH Yang, R Pradeep… - arXiv preprint arXiv …, 2021 - arxiv.org
Pyserini is an easy-to-use Python toolkit that supports replicable IR research by providing
effective first-stage retrieval in a multi-stage ranking architecture. Our toolkit is self-contained …

An in-depth investigation on the behavior of measures to quantify reproducibility

M Maistro, T Breuer, P Schaer, N Ferro - Information Processing & …, 2023 - Elsevier
Science is facing a so-called reproducibility crisis, where researchers struggle to repeat
experiments and to get the same or comparable results. This represents a fundamental …

Reproducibility challenges in information retrieval evaluation

N Ferro - Journal of Data and Information Quality (JDIQ), 2017 - dl.acm.org
Information Retrieval (IR) is concerned with ranking information resources with respect to
user information needs, delivering a wide range of key applications for industry and society …

How to measure the reproducibility of system-oriented IR experiments

T Breuer, N Ferro, N Fuhr, M Maistro, T Sakai… - Proceedings of the 43rd …, 2020 - dl.acm.org
Replicability and reproducibility of experimental results are primary concerns in all the areas
of science and IR is not an exception. Besides the problem of moving the field towards more …