Photon: A fast query engine for lakehouse systems

A Behm, S Palkar, U Agarwal, T Armstrong… - Proceedings of the …, 2022 - dl.acm.org
Many organizations are shifting to a data management paradigm called the "Lakehouse,"
which implements the functionality of structured data warehouses on top of unstructured …

Query processing on tensor computation runtimes

D He, S Nakandala, D Banda, R Sen, K Saur… - arXiv preprint arXiv …, 2022 - arxiv.org
The huge demand for computation in artificial intelligence (AI) is driving unparalleled
investments in hardware and software systems for AI. This leads to an explosion in the …

Incremental Fusion: Unifying Compiled and Vectorized Query Execution

B Wagner, A Kohn, P Boncz… - 2024 IEEE 40th …, 2024 - ieeexplore.ieee.org
Modern high-performance analytical query engines follow one of two execution paradigms.
Vectorized engines implement an interpreter for relational algebra operators that operates …

BⓈX: Subgraph Matching with Batch Backtracking Search

Y Lu, Z Zhang, W Zheng - Proceedings of the ACM on Management of …, 2025 - dl.acm.org
Subgraph matching is a fundamental problem in graph analysis. Recently, many algorithms
have been developed, often using classic backtracking search. This traditional backtracking …

Data Chunk Compaction in Vectorized Execution

Y Qiao, H Zhang - Proceedings of the ACM on Management of Data, 2025 - dl.acm.org
Modern analytical database management systems often adopt vectorized query execution
engines that process columnar data in batches (i.e., data chunks) to minimize the …

NULLS!: Revisiting Null Representation in Modern Columnar Formats

X Zeng, R Meng, A Pavlo, W McKinney… - Proceedings of the 20th …, 2024 - dl.acm.org
Nulls are common in real-world data sets, yet recent research on columnar formats and
encodings rarely addresses Null representations. Popular file formats like Parquet and ORC …

Data Systems for Explainable AI and Incorporating AI Infrastructure into Data Systems

D He - 2024 - search.proquest.com
Artificial Intelligence (AI) has become a cornerstone of modern computing, powering a wide
range of applications in fields from face recognition and machine translation to medical …

Balancing memory consumption and performance of in-memory database systems

M Boissier - 2024 - publishup.uni-potsdam.de
In-memory database systems keep their data primarily resident in main memory to avoid
performance penalties from accessing persistent storage mediums such as solid-state …

Predicate Pushdown in FastLanes

R Duņamalijevs - 2024 - eprints.illc.uva.nl
This project explores predicate evaluation for the FastLanes file format within the framework
of cascaded encoding, which encodes the data in multiple layers to achieve higher …

Photon: A Fast Query Engine for Lakehouse Systems

TB Samwel, H van Hovell, M Xue, R Xin, M Zaharia - 2022 - liuyehcf.github.io
Many organizations are shifting to a data management paradigm called the “Lakehouse,”
which implements the functionality of structured data warehouses on top of unstructured …