One-pass distribution sketch for measuring data heterogeneity in federated learning
Federated learning (FL) is a machine learning paradigm where multiple client devices train
models collaboratively without data exchange. Data heterogeneity problem is naturally …
GSearch: ultra-fast and scalable genome search by combining K-mer hashing with hierarchical navigable small world graphs
Genome search and/or classification typically involves finding the best-match database
(reference) genomes and has become increasingly challenging due to the growing number …
Dessert: An efficient algorithm for vector set search with vector set queries
We study the problem of vector set search with vector set
queries. This task is analogous to traditional near-neighbor search, with the exception …
Scalable and efficient non-adaptive deterministic group testing
Group Testing (GT) is about learning a (hidden) subset $K$, of size $k$, of some large
domain $N$, of size $n \gg k$, using a sequence of queries. A result of a query provides …
Combinatorial group testing with selfish agents
We study the Combinatorial Group Testing (CGT) problem in a novel game-theoretic
framework, with a solution concept of Adversarial Equilibrium (AE). In this new …
CAPS: A Practical Partition Index for Filtered Similarity Search
With the surging popularity of approximate near-neighbor search (ANNS), driven by
advances in neural representation learning, the ability to serve queries accompanied by a …
Fundamental Limitations on Subquadratic Alternatives to Transformers
The Transformer architecture is widely deployed in many popular and impactful Large
Language Models. At its core is the attention mechanism for calculating correlations …
Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection
This work presents an adaptive group testing framework for the range-based high
dimensional near neighbor search problem. Our method efficiently marks each item in a …
A uniformed adaptive early termination model through probabilistic feature to speed up quantization-based search
J Jiang, S Xu, Y Gao - International Journal of Computers and …, 2024 - Taylor & Francis
The quantization-based approaches are the effective techniques for addressing the problem
of approximate nearest neighbor search. However, most of these methods typically employ a …
Combinatorial Group Testing in Presence of Deletions
The study in group testing aims to develop strategies to identify a small set of defective items
among a large population using a few pooled tests. The established techniques have been …