One-pass distribution sketch for measuring data heterogeneity in federated learning

Z Liu, Z Xu, B Coleman… - Advances in Neural …, 2023 - proceedings.neurips.cc
Federated learning (FL) is a machine learning paradigm where multiple client devices train
models collaboratively without data exchange. The data heterogeneity problem is naturally …

GSearch: ultra-fast and scalable genome search by combining K-mer hashing with hierarchical navigable small world graphs

J Zhao, JP Both, LM Rodriguez-R… - Nucleic Acids …, 2024 - academic.oup.com
Genome search and/or classification typically involves finding the best-match database
(reference) genomes and has become increasingly challenging due to the growing number …

Dessert: An efficient algorithm for vector set search with vector set queries

J Engels, B Coleman, V Lakshman… - Advances in Neural …, 2023 - proceedings.neurips.cc
We study the problem of vector set search with vector set
queries. This task is analogous to traditional near-neighbor search, with the exception …

Scalable and efficient non-adaptive deterministic group testing

D Kowalski, D Pajak - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Group Testing (GT) is about learning a (hidden) subset $K$, of size $k$, of some large
domain $N$, of size $n \gg k$, using a sequence of queries. A result of a query provides …

Combinatorial group testing with selfish agents

G Chionas, D Kowalski… - Advances in Neural …, 2023 - proceedings.neurips.cc
We study the Combinatorial Group Testing (CGT) problem in a novel game-theoretic
framework, with a solution concept of Adversarial Equilibrium (AE). In this new …

CAPS: A Practical Partition Index for Filtered Similarity Search

G Gupta, J Yi, B Coleman, C Luo, V Lakshman… - arXiv preprint arXiv …, 2023 - arxiv.org
With the surging popularity of approximate near-neighbor search (ANNS), driven by
advances in neural representation learning, the ability to serve queries accompanied by a …

Fundamental Limitations on Subquadratic Alternatives to Transformers

J Alman, H Yu - arXiv preprint arXiv:2410.04271, 2024 - arxiv.org
The Transformer architecture is widely deployed in many popular and impactful Large
Language Models. At its core is the attention mechanism for calculating correlations …

Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection

H Shah, K Mittal, A Rajwade - European Conference on Computer Vision, 2024 - Springer
This work presents an adaptive group testing framework for the range-based high
dimensional near neighbor search problem. Our method efficiently marks each item in a …

A uniformed adaptive early termination model through probabilistic feature to speed up quantization-based search

J Jiang, S Xu, Y Gao - International Journal of Computers and …, 2024 - Taylor & Francis
The quantization-based approaches are the effective techniques for addressing the problem
of approximate nearest neighbor search. However, most of these methods typically employ a …

Combinatorial Group Testing in Presence of Deletions

V Gandikota, N Polyanskii, H Yang - arXiv preprint arXiv:2310.09613, 2023 - arxiv.org
The study in group testing aims to develop strategies to identify a small set of defective items
among a large population using a few pooled tests. The established techniques have been …