Asymmetric LSH (ALSH) for sublinear time maximum inner product search (MIPS)

A Shrivastava, P Li - Advances in neural information …, 2014 - proceedings.neurips.cc
We present the first provably sublinear time hashing algorithm for approximate
Maximum Inner Product Search (MIPS). Searching with (un-normalized) inner product as …

Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions

A Andoni, P Indyk - Communications of the ACM, 2008 - dl.acm.org
In this article, we give an overview of efficient algorithms for the approximate and exact
nearest neighbor problem. The goal is to preprocess a dataset of objects (e.g., images) so …

Locality-sensitive hashing for finding nearest neighbors [lecture notes]

M Slaney, M Casey - IEEE Signal processing magazine, 2008 - ieeexplore.ieee.org
This lecture note describes a technique known as locality-sensitive hashing (LSH) that
allows one to quickly find similar entries in large databases. This approach belongs to a …

Scalable and sustainable deep learning via randomized hashing

R Spring, A Shrivastava - Proceedings of the 23rd ACM SIGKDD …, 2017 - dl.acm.org
Current deep learning architectures are growing larger in order to learn from complex
datasets. These architectures require giant matrix multiplication operations to train millions …

Detecting code clones in binary executables

A Sæbjørnsen, J Willcock, T Panas… - Proceedings of the …, 2009 - dl.acm.org
Large software projects contain significant code duplication, mainly due to copying and
pasting code. Many techniques have been developed to identify duplicated code to enable …

One permutation hashing

P Li, A Owen, CH Zhang - Advances in Neural Information …, 2012 - proceedings.neurips.cc
While minwise hashing is promising for large-scale learning in massive binary data, the
preprocessing cost is prohibitive as it requires applying (e.g.,) $k = 500$ permutations on the …

Detection of recurring software vulnerabilities

NH Pham, TT Nguyen, HA Nguyen… - Proceedings of the 25th …, 2010 - dl.acm.org
Software security vulnerabilities are discovered on an almost daily basis and have caused
substantial damage. Aiming at supporting early detection and resolution for them, we have …

Clone management for evolving software

HA Nguyen, TT Nguyen, NH Pham… - IEEE transactions on …, 2011 - ieeexplore.ieee.org
Recent research results suggest a need for code clone management. In this paper, we
introduce JSync, a novel clone management tool. JSync provides two main functions to …

In defense of minhash over simhash

A Shrivastava, P Li - Artificial intelligence and statistics, 2014 - proceedings.mlr.press
MinHash and SimHash are the two widely adopted Locality Sensitive Hashing (LSH)
algorithms for large-scale data processing applications. Deciding which LSH to use for a …

Improved asymmetric locality sensitive hashing (ALSH) for maximum inner product search (MIPS)

A Shrivastava, P Li - arXiv preprint arXiv:1410.5410, 2014 - arxiv.org
Recently it was shown that the problem of Maximum Inner Product Search (MIPS) is efficient
and it admits provably sub-linear hashing algorithms. Asymmetric transformations before …