Milvus: A purpose-built vector data management system

J Wang, X Yi, R Guo, H Jin, P Xu, S Li, X Wang… - Proceedings of the …, 2021 - dl.acm.org
Recently, there has been a pressing need to manage high-dimensional vector data in data
science and AI applications. This trend is fueled by the proliferation of unstructured data and …

The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds

P Ferragina, G Vinciguerra - Proceedings of the VLDB Endowment, 2020 - dl.acm.org
We present the first learned index that supports predecessor, range queries and updates
within provably efficient time and space bounds in the worst case. In the (static) context of …

Morton filters: faster, space-efficient cuckoo filters via biasing, compression, and decoupled logical sparsity

AD Breslow, NS Jayasena - Proceedings of the VLDB Endowment, 2018 - dl.acm.org
Approximate set membership data structures (ASMDSs) are ubiquitous in computing. They
trade a tunable, often small, error rate (ϵ) for large space savings. The canonical ASMDS is …

BtrBlocks: efficient columnar compression for data lakes

M Kuschewski, D Sauerwein, A Alhomssi… - Proceedings of the ACM …, 2023 - dl.acm.org
Analytics is moving to the cloud and data is moving into data lakes. These reside on object
storage services like S3 and enable seamless data sharing and system interoperability. To …

DAGA: Detecting attacks to in-vehicle networks via n-gram analysis

D Stabili, L Ferretti, M Andreolini… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Recent research showcased several cyber-attacks against unmodified licensed vehicles,
demonstrating the vulnerability of their internal networks. Many solutions have already been …

Roaring bitmaps: Implementation of an optimized software library

D Lemire, O Kaser, N Kurz, L Deri… - Software: Practice …, 2018 - Wiley Online Library
Compressed bitmap indexes are used in systems such as Git or Oracle to accelerate
queries. They represent sets and often support operations such as unions, intersections …

Instance-optimized data layouts for cloud analytics workloads

J Ding, UF Minhas, B Chandramouli, C Wang… - Proceedings of the …, 2021 - dl.acm.org
Today, businesses rely on efficiently running analytics on large amounts of operational and
historical data to gain business insights and competitive advantage. Increasingly, such …

Tile-based lightweight integer compression in GPU

A Shanbhag, BW Yogatama, X Yu… - Proceedings of the 2022 …, 2022 - dl.acm.org
GPUs are increasingly used for high-performance and interactive data analytics workloads
due to their capability to accelerate computation using massive parallelism. A key constraint …

Speeding up set intersections in graph algorithms using SIMD instructions

S Han, L Zou, JX Yu - Proceedings of the 2018 International Conference …, 2018 - dl.acm.org
In this paper, we focus on accelerating a widely employed computing pattern---set
intersection, to boost a group of graph algorithms. Graph's adjacency-lists can be naturally …

Identifying insufficient data coverage in databases with multiple relations

Y Lin, Y Guan, A Asudeh, HV Jagadish - Proceedings of the VLDB …, 2020 - par.nsf.gov
In today's data-driven world, it is critical that we use appropriate datasets for analysis and
decision-making. Datasets could be biased because they reflect existing inequalities in the …