Semantic models for the first-stage retrieval: A comprehensive review
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …
Information retrieval: recent advances and beyond
This paper provides an extensive and thorough overview of the models and techniques
utilized in the first and second stages of the typical information retrieval processing chain …
Exploring the limits of weakly supervised pretraining
State-of-the-art visual perception models for a wide range of tasks rely on supervised
pretraining. ImageNet classification is the de facto pretraining task for these models. Yet …
Learning vector-quantized item representation for transferable sequential recommenders
Recently, the generality of natural language text has been leveraged to develop transferable
recommender systems. The basic idea is to employ pre-trained language models (PLM) to …
Accelerating large-scale inference with anisotropic vector quantization
Quantization based techniques are the current state-of-the-art for scaling maximum inner
product search to massive databases. Traditional approaches to quantization aim to …
Billion-scale similarity search with GPUs
Similarity search finds application in database systems handling complex data such as
images or videos, which are typically represented by high-dimensional features and require …
Milvus: A purpose-built vector data management system
Recently, there has been a pressing need to manage high-dimensional vector data in data
science and AI applications. This trend is fueled by the proliferation of unstructured data and …
Nonparametric masked language modeling
Existing language models (LMs) predict tokens with a softmax over a finite vocabulary,
which can make it difficult to predict rare tokens or phrases. We introduce NPM, the first …
Sketch-based manga retrieval using Manga109 dataset
Y Matsui, K Ito, Y Aramaki, A Fujimoto, T Ogawa… - Multimedia tools and …, 2017 - Springer
Manga (Japanese comics) are popular worldwide. However, current e-manga archives offer
very limited search support, i.e., keyword-based search by title or author. To make the manga …
Accelerating very deep convolutional networks for classification and detection
This paper aims to accelerate the test-time computation of convolutional neural networks
(CNNs), especially very deep CNNs that have substantially impacted the computer vision …