Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …

Information retrieval: recent advances and beyond

KA Hambarde, H Proenca - IEEE Access, 2023 - ieeexplore.ieee.org
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …

Exploring the limits of weakly supervised pretraining

D Mahajan, R Girshick… - Proceedings of the …, 2018 - openaccess.thecvf.com
State-of-the-art visual perception models for a wide range of tasks rely on supervised
pretraining. ImageNet classification is the de facto pretraining task for these models. Yet …

Learning vector-quantized item representation for transferable sequential recommenders

Y Hou, Z He, J McAuley, WX Zhao - … of the ACM Web Conference 2023, 2023 - dl.acm.org
Recently, the generality of natural language text has been leveraged to develop transferable
recommender systems. The basic idea is to employ pre-trained language models (PLM) to …

Accelerating large-scale inference with anisotropic vector quantization

R Guo, P Sun, E Lindgren, Q Geng… - International …, 2020 - proceedings.mlr.press
Quantization based techniques are the current state-of-the-art for scaling maximum inner
product search to massive databases. Traditional approaches to quantization aim to …

Billion-scale similarity search with GPUs

J Johnson, M Douze, H Jégou - IEEE Transactions on Big Data, 2019 - ieeexplore.ieee.org
Similarity search finds application in database systems handling complex data such as
images or videos, which are typically represented by high-dimensional features and require …

Milvus: A purpose-built vector data management system

J Wang, X Yi, R Guo, H Jin, P Xu, S Li, X Wang… - Proceedings of the …, 2021 - dl.acm.org
Recently, there has been a pressing need to manage high-dimensional vector data in data
science and AI applications. This trend is fueled by the proliferation of unstructured data and …

Nonparametric masked language modeling

S Min, W Shi, M Lewis, X Chen, W Yih… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary,
which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first …

Sketch-based manga retrieval using manga109 dataset

Y Matsui, K Ito, Y Aramaki, A Fujimoto, T Ogawa… - Multimedia tools and …, 2017 - Springer
Manga (Japanese comics) are popular worldwide. However, current e-manga archives offer
very limited search support, i.e., keyword-based search by title or author. To make the manga …

Accelerating very deep convolutional networks for classification and detection

X Zhang, J Zou, K He, J Sun - IEEE transactions on pattern …, 2015 - ieeexplore.ieee.org
This paper aims to accelerate the test-time computation of convolutional neural networks
(CNNs), especially very deep CNNs that have substantially impacted the computer vision …