Enhancing K-nearest neighbor algorithm: a comprehensive review and performance analysis of modifications

RK Halder, MN Uddin, MA Uddin, S Aryal, A Khraisat - Journal of Big Data, 2024 - Springer
The k-Nearest Neighbors (kNN) method, established in 1951, has since evolved
into a pivotal tool in data mining, recommendation systems, and Internet of Things (IoT) …

Orchestrating single-cell analysis with Bioconductor

RA Amezquita, ATL Lun, E Becht, VJ Carey… - Nature …, 2020 - nature.com
Recent technological advancements have enabled the profiling of a large number of
genome-wide features in individual cells. However, single-cell data present unique …

Dense text retrieval based on pretrained language models: A survey

WX Zhao, J Liu, R Ren, JR Wen - ACM Transactions on Information …, 2024 - dl.acm.org
Text retrieval is a long-standing research topic on information seeking, where a system is
required to return relevant information resources to user's queries in natural language. From …

Survey of vector database management systems

JJ Pan, J Wang, G Li - The VLDB Journal, 2024 - Springer
There are now over 20 commercial vector database management systems (VDBMSs), all
produced within the past five years. But embedding-based retrieval has been studied for …

Milvus: A purpose-built vector data management system

J Wang, X Yi, R Guo, H Jin, P Xu, S Li, X Wang… - Proceedings of the …, 2021 - dl.acm.org
Recently, there has been a pressing need to manage high-dimensional vector data in data
science and AI applications. This trend is fueled by the proliferation of unstructured data and …

A comprehensive survey and experimental comparison of graph-based approximate nearest neighbor search

M Wang, X Xu, Q Yue, Y Wang - arXiv preprint arXiv:2101.12631, 2021 - arxiv.org
Approximate nearest neighbor search (ANNS) constitutes an important operation in a
multitude of applications, including recommendation systems, information retrieval, and …

Pre-training methods in information retrieval

Y Fan, X Xie, Y Cai, J Chen, X Ma, X Li… - … and Trends® in …, 2022 - nowpublishers.com
The core of information retrieval (IR) is to identify relevant information from large-scale
resources and return it as a ranked list to respond to user's information need. In recent years …

Multi-modal hashing for efficient multimedia retrieval: A survey

L Zhu, C Zheng, W Guan, J Li, Y Yang… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
With the explosive growth of multimedia contents, multimedia retrieval is facing
unprecedented challenges on both storage cost and retrieval speed. Hashing technique can …

Semantic models for the first-stage retrieval: A comprehensive review

J Guo, Y Cai, Y Fan, F Sun, R Zhang… - ACM Transactions on …, 2022 - dl.acm.org
Multi-stage ranking pipelines have been a practical solution in modern search systems,
where the first-stage retrieval is to return a subset of candidate documents and latter stages …

Fast approximate nearest neighbor search with the navigating spreading-out graph

C Fu, C Xiang, C Wang, D Cai - arXiv preprint arXiv:1707.00143, 2017 - arxiv.org
Approximate nearest neighbor search (ANNS) is a fundamental problem in databases and
data mining. A scalable ANNS algorithm should be both memory-efficient and fast. Some …