Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications
The k-Nearest Neighbors (kNN) method, established in 1951, has since evolved
into a pivotal tool in data mining, recommendation systems, and Internet of Things (IoT) …
Orchestrating single-cell analysis with Bioconductor
Recent technological advancements have enabled the profiling of a large number of
genome-wide features in individual cells. However, single-cell data present unique …
Dense text retrieval based on pretrained language models: A survey
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …
Survey of vector database management systems
There are now over 20 commercial vector database management systems (VDBMSs), all
produced within the past five years. But embedding-based retrieval has been studied for …
Milvus: A purpose-built vector data management system
Recently, there has been a pressing need to manage high-dimensional vector data in data
science and AI applications. This trend is fueled by the proliferation of unstructured data and …
A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search
Approximate nearest neighbor search (ANNS) constitutes an important operation in a
multitude of applications, including recommendation systems, information retrieval, and …
Pre-training methods in information retrieval
The core of information retrieval (IR) is to identify relevant information from large-scale
resources and return it as a ranked list to respond to user's information need. In recent years …
Multi-modal hashing for efficient multimedia retrieval: A survey
With the explosive growth of multimedia contents, multimedia retrieval is facing
unprecedented challenges on both storage cost and retrieval speed. Hashing technique can …
Semantic models for the first-stage retrieval: A comprehensive review
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …
Fast approximate nearest neighbor search with the navigating spreading-out graph
Approximate nearest neighbor search (ANNS) is a fundamental problem in databases and
data mining. A scalable ANNS algorithm should be both memory-efficient and fast. Some …