Various properties of various ultrafilters, various graph width parameters, and various connectivity systems (with survey)
T Fujita - arXiv preprint arXiv:2408.02299, 2024 - arxiv.org
This paper investigates ultrafilters in the context of connectivity systems, defined as pairs
$(X, f)$ where $X$ is a finite set and $f$ is a symmetric submodular function. Ultrafilters …
Adopting worst-case optimal joins in relational database systems
Worst-case optimal join algorithms are attractive from a theoretical point of view, as they offer
asymptotically better runtime than binary joins on certain types of queries. In particular, they …
Joinboost: Grow trees over normalized data using only SQL
Although dominant for tabular data, ML libraries that train tree models over normalized
databases (e.g., LightGBM, XGBoost) require the data to be denormalized as a single table …
Tight fine-grained bounds for direct access on join queries
We consider the task of lexicographic direct access to query answers. That is, we want to
simulate an array containing the answers of a join query sorted in a lexicographic order …
Saibot: A Differentially Private Data Search Platform
Recent data search platforms use ML task-based utility measures rather than metadata-
based keywords, to search large dataset corpora. Requesters submit a training dataset and …
The fine-grained complexity of boolean conjunctive queries and sum-product problems
We study the fine-grained complexity of evaluating Boolean Conjunctive Queries and their
generalization to sum-of-product problems over an arbitrary semiring. For these problems …
The relational data borg is learning
D Olteanu - arXiv preprint arXiv:2008.07864, 2020 - arxiv.org
This paper overviews an approach that addresses machine learning over relational data as
a database problem. This is justified by two observations. First, the input to the learning task …
Lightweight Materialization for Fast Dashboards Over Joins
Dashboards are vital in modern business intelligence tools, providing non-technical users
with an interface to access comprehensive business data. With the rise of cloud technology …
Optimal join algorithms meet top-k
Top-k queries have been studied intensively in the database community and they are an
important means to reduce query cost when only the "best" or "most interesting" results are …
Kitana: Efficient Data Augmentation Search for AutoML
AutoML services provide a way for non-expert users to benefit from high-quality ML models
without worrying about model design and deployment, in exchange for a charge per hour …