A deep dive into common open formats for analytical DBMSs

C Liu, A Pavlenko, M Interlandi, B Haynes - Proceedings of the VLDB …, 2023 - dl.acm.org
This paper evaluates the suitability of Apache Arrow, Parquet, and ORC as formats for
subsumption in an analytical DBMS. We systematically identify and explore the high-level …

Self-tuning Database Systems: A Systematic Literature Review of Automatic Database Schema Design and Tuning

M Mozaffari, A Dignös, J Gamper, U Störl - ACM Computing Surveys, 2024 - dl.acm.org
Self-tuning is a feature of autonomic databases that includes the problem of automatic
schema design. It aims at providing an optimized schema that increases the overall …

Proteus: Autonomous adaptive storage for mixed workloads

M Abebe, H Lazu, K Daudjee - … of the 2022 International Conference on …, 2022 - dl.acm.org
Enterprises use distributed database systems to meet the demands of mixed or hybrid
transaction/analytical processing (HTAP) workloads that contain both transactional (OLTP) …

SAT: sampling acceleration tree for adaptive database repartition

X ** time series for efficient columnar storage
C Fang, S Song, H Guan, X Huang, C Wang… - Proceedings of the ACM …, 2023 - dl.acm.org
Columnar storage is now an industry standard design in most open-source or commercial
time series database products, making them HTAP systems. The time column of a time …

Partition, Don't Sort! Compression Boosters for Cloud Data Ingestion Pipelines

P Hansert, S Michel - Proceedings of the VLDB Endowment, 2024 - dl.acm.org
Data Lakes deployed in the cloud are a go-to solution for enterprise data storage. While the
pay-as-you-go cost model allows flexible resource allocation and billing, it mandates an …

DrTM+B: Replication-driven live reconfiguration for fast and general distributed transaction processing

S Shen, X Wei, R Chen, H Chen… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Recent in-memory database systems leverage advanced hardware features like RDMA to
provide transaction processing at millions of transactions per second. Distributed transaction …

DataLab: A Unified Platform for LLM-Powered Business Intelligence

L Weng, Y Tang, Y Feng, Z Chang, P Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Business intelligence (BI) transforms large volumes of data within modern organizations into
actionable insights for informed decision-making. Recently, large language model (LLM) …

SH2O: Efficient Data Access for Work-Sharing Databases

P Sioulas, I Mytilinis, A Ailamaki - … of the ACM on Management of Data, 2023 - dl.acm.org
Interactive applications require processing tens to hundreds of concurrent analytical queries
within tight time constraints. In such setups, where high concurrency causes contention …