Survey and taxonomy of lossless graph compression and space-efficient graph representations

M Besta, T Hoefler - arXiv preprint arXiv:1806.01799, 2018 - arxiv.org
Various graphs such as web or social networks may contain up to trillions of edges.
Compressing such datasets can accelerate graph processing by reducing the amount of I/O …

GraphScope: parameter-free mining of large time-evolving graphs

J Sun, C Faloutsos, S Papadimitriou… - Proceedings of the 13th …, 2007 - dl.acm.org
How can we find communities in dynamic networks of social interactions, such as who calls
whom, who emails whom, or who sells to whom? How can we spot discontinuity time-points …

The concentration of fractional distances

D François, V Wertz, M Verleysen - IEEE Transactions on …, 2007 - ieeexplore.ieee.org
Nearest neighbor search and many other numerical data analysis tools most often rely on
the use of the Euclidean distance. When data are high dimensional, however, the Euclidean …

TSP—infrastructure for the traveling salesperson problem

M Hahsler, K Hornik - Journal of Statistical Software, 2008 - jstatsoft.org
The traveling salesperson (or, salesman) problem (TSP) is a well known and important
combinatorial optimization problem. The goal is to find the shortest tour that visits each city in …

PICS: Parameter-free identification of cohesive subgroups in large attributed graphs

L Akoglu, H Tong, B Meeder, C Faloutsos - Proceedings of the 2012 SIAM …, 2012 - SIAM
Given a graph with node attributes, how can we find meaningful patterns such as clusters,
bridges, and outliers? Attributed graphs appear in real world in the form of social networks …

Sorting improves word-aligned bitmap indexes

D Lemire, O Kaser, K Aouiche - Data & Knowledge Engineering, 2010 - Elsevier
Bitmap indexes must be compressed to reduce input/output costs and minimize CPU usage.
To accelerate logical operations (AND, OR, XOR) over bitmaps, we use techniques based …

Processing a trillion cells per mouse click

A Hall, O Bachmann, R Büssow, S Gănceanu… - arXiv preprint arXiv …, 2012 - arxiv.org
Column-oriented database systems have been a real game changer for the industry in
recent years. Highly tuned and performant systems have evolved that provide users with the …

ISABELA-QA: Query-driven analytics with ISABELA-compressed extreme-scale scientific data

S Lakshminarasimhan, J Jenkins, I Arkatkar… - Proceedings of 2011 …, 2011 - dl.acm.org
Efficient analytics of scientific data from extreme-scale simulations is quickly becoming a
top-notch priority. The increasing simulation output data sizes demand for a paradigm shift in …

Summarizing transactional databases with overlapped hyperrectangles

Y Xiang, R Jin, D Fuhry, FF Dragan - Data Mining and Knowledge …, 2011 - Springer
Transactional data are ubiquitous. Several methods, including frequent itemset mining and
co-clustering, have been proposed to analyze transactional databases. In this work, we …

Rearrangement Clustering: Pitfalls, Remedies, and Applications

S Climer, W Zhang, T Joachims - Journal of Machine Learning Research, 2006 - jmlr.org
Given a matrix of values in which the rows correspond to objects and the columns
correspond to features of the objects, rearrangement clustering is the problem of rearranging …