- Academic Search

Metam: Goal-oriented data discovery

S Galhotra, Y Gong… - 2023 IEEE 39th …, 2023 - ieeexplore.ieee.org

Data is a central component of machine learning and causal inference tasks. The availability
of large amounts of data from sources such as open data repositories, data lakes and data …

Cited by: 38

Observatory: Characterizing embeddings of relational tables

T Cong, M Hulsebos, Z Sun, P Groth… - arXiv preprint arXiv …, 2023 - arxiv.org

Language models and specialized table embedding models have recently demonstrated
strong performance on many tasks over tabular data. Researchers and practitioners are …

Cited by: 11

Retrieve, merge, predict: Augmenting tables with data lakes

R Cappuzzo, A Coelho, F Lefebvre, P Papotti… - arXiv preprint arXiv …, 2024 - arxiv.org

Machine-learning from a disparate set of tables, a data lake, requires assembling features
by merging and aggregating tables. Data discovery can extend autoML to data tables by …

Cited by: 5

Warpgate: A semantic join discovery system for cloud data warehouses

T Cong, J Gale, J Frantz, HV Jagadish… - arXiv preprint arXiv …, 2022 - arxiv.org

Data discovery is a major challenge in enterprise data analysis: users often struggle to find
data relevant to their analysis goals or even to navigate through data across data sources …

Cited by: 19

UniDM: A Unified Framework for Data Manipulation with Large Language Models

Y Qian, Y He, R Zhu, J Huang, Z Ma… - Proceedings of …, 2024 - proceedings.mlsys.org

Designing effective data manipulation methods is a long standing problem in data lakes.
Traditional methods, which rely on rules or machine learning models, require extensive …

Cited by: 4

Towards an architecture to support data access in research data spaces

J Möller, D Jankowski, A Hahn - 2021 IEEE 22nd International …, 2021 - ieeexplore.ieee.org

Using data from different data sources is a common procedure in data-driven research. As
required data is often not available from centrally managed sources, the concept of data …

Cited by: 6

Suggesting assess queries for interactive analysis of multidimensional data

M Francia, M Golfarelli, P Marcel, S Rizzi… - … on Knowledge and …, 2022 - ieeexplore.ieee.org

Assessment is the process of comparing the actual to the expected behavior of a business
phenomenon and judging the outcome of the comparison. The querying operator has been …

Cited by: 4

FREYJA: Efficient Join Discovery in Data Lakes

M Maynou, S Nadal, R Panadero, J Flores… - arXiv preprint arXiv …, 2024 - arxiv.org

Data lakes are massive repositories of raw and heterogeneous data, designed to meet the
requirements of modern data storage. Nonetheless, this same philosophy increases the …

It Took Longer than I was Expecting: Why is Dataset Search Still so Hard?

M Hulsebos, W Lin, S Shankar… - Proceedings of the 2024 …, 2024 - dl.acm.org

Dataset search is a long-standing problem across both industry and academia. While most
industry tools focus on identifying one or more datasets matching a user-specified query …

Cited by: 3

[BOOK][B] Table Representation Learning

M Hulsebos - 2024 - pure.uva.nl

The increasing amount of data being collected, stored, and analyzed, induces a need for
efficient, scalable, and robust methods to handle this data. Representation learning, i.e., the …
