Pyserini: A Python toolkit for reproducible information retrieval research with sparse and dense representations
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and
dense representations. It aims to provide effective, reproducible, and easy-to-use first-stage …
Conversational information seeking
Conversational information seeking (CIS) is concerned with a sequence of interactions
between one or more users and an information system. Interactions in CIS are primarily …
How does generative retrieval scale to millions of passages?
Popularized by the Differentiable Search Index, the emerging paradigm of generative
retrieval re-frames the classic information retrieval problem into a sequence-to-sequence …
RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze!
In information retrieval, proprietary large language models (LLMs) such as GPT-4 and open-
source counterparts such as LLaMA and Vicuna have played a vital role in reranking …
Exploring listwise evidence reasoning with T5 for fact verification
This work explores a framework for fact verification that leverages pretrained sequence-to-
sequence transformer models for sentence selection and label prediction, two key sub-tasks …
Squeezing water from a stone: a bag of tricks for further improving cross-encoder effectiveness for reranking
While much recent work has demonstrated that hard negative mining can be used to train
better bi-encoder models, few have considered it in the context of cross-encoders, which are …
Overview of the TREC 2020 Health Misinformation Track.
TREC 2021 was the third year for the Health Misinformation track, which was named the
Decision Track in 2019 [1]. In 2021, the track had an ad-hoc retrieval task. In each year, the …
Everything we hear: Towards tackling misinformation in podcasts
Advances in generative AI, the proliferation of large multimodal models (LMMs), and
democratized open access to these technologies have direct implications for the production …
Read it twice: Towards faithfully interpretable fact verification by revisiting evidence
The real-world fact verification task aims to verify the factuality of a claim by retrieving evidence
from the source document. The quality of the retrieved evidence plays an important role in …
Neural query synthesis and domain-specific ranking templates for multi-stage clinical trial matching
In this work, we propose an effective multi-stage neural ranking system for the clinical trial
matching problem. First, we introduce NQS, a neural query synthesis method that leverages …