A deep dive into common open formats for analytical DBMSs
This paper evaluates the suitability of Apache Arrow, Parquet, and ORC as formats for
subsumption in an analytical DBMS. We systematically identify and explore the high-level …
Self-tuning Database Systems: A Systematic Literature Review of Automatic Database Schema Design and Tuning
Self-tuning is a feature of autonomic databases that includes the problem of automatic
schema design. It aims at providing an optimized schema that increases the overall …
Proteus: Autonomous adaptive storage for mixed workloads
Enterprises use distributed database systems to meet the demands of mixed or hybrid
transaction/analytical processing (HTAP) workloads that contain both transactional (OLTP) …
SAT: sampling acceleration tree for adaptive database repartition
X ** time series for efficient columnar storage
Columnar storage is now an industry standard design in most open-source or commercial
time series database products, making them HTAP systems. The time column of a time …
Partition, Don't Sort! Compression Boosters for Cloud Data Ingestion Pipelines
P Hansert, S Michel - Proceedings of the VLDB Endowment, 2024 - dl.acm.org
Data Lakes deployed in the cloud are a go-to solution for enterprise data storage. While the
pay-as-you-go cost model allows flexible resource allocation and billing, it mandates an …
DrTM+B: Replication-driven live reconfiguration for fast and general distributed transaction processing
Recent in-memory database systems leverage advanced hardware features like RDMA to
provide transaction processing at millions of transactions per second. Distributed transaction …
DataLab: A Unified Platform for LLM-Powered Business Intelligence
Business intelligence (BI) transforms large volumes of data within modern organizations into
actionable insights for informed decision-making. Recently, large language model (LLM) …
SH2O: Efficient Data Access for Work-Sharing Databases
Interactive applications require processing tens to hundreds of concurrent analytical queries
within tight time constraints. In such setups, where high concurrency causes contention …