Knowledge graphs

A Hogan, E Blomqvist, M Cochez, C d'Amato… - ACM Computing …, 2021 - dl.acm.org
In this article, we provide a comprehensive introduction to knowledge graphs, which have
recently garnered significant attention from both industry and academia in scenarios that …

A comprehensive survey on automatic knowledge graph construction

L Zhong, J Wu, Q Li, H Peng, X Wu - ACM Computing Surveys, 2023 - dl.acm.org
Automatic knowledge graph construction aims at manufacturing structured human
knowledge. To this end, much effort has historically been spent extracting informative fact …

Data lake management: challenges and opportunities

F Nargesian, E Zhu, RJ Miller, KQ Pu… - Proceedings of the VLDB …, 2019 - dl.acm.org
The ubiquity of data lakes has created fascinating new challenges for data management
research. In this tutorial, we review the state-of-the-art in data management for data lakes …

TUTA: Tree-based transformers for generally structured table pre-training

Z Wang, H Dong, R Jia, J Li, Z Fu, S Han… - Proceedings of the 27th …, 2021 - dl.acm.org
We propose TUTA, a unified pre-training architecture for understanding generally structured
tables. Noticing that understanding a table requires spatial, hierarchical, and semantic …

From tabular data to knowledge graphs: A survey of semantic table interpretation tasks and methods

J Liu, Y Chabot, R Troncy, VP Huynh, T Labbé… - Journal of Web …, 2023 - Elsevier
Tabular data often refers to data that is organized in a table with rows and columns. We
observe that this data format is widely used on the Web and within enterprise data …

Web table extraction, retrieval, and augmentation: A survey

S Zhang, K Balog - ACM Transactions on Intelligent Systems and …, 2020 - dl.acm.org
Tables are powerful and popular tools for organizing and manipulating data. A vast number
of tables can be found on the Web, which represent a valuable knowledge resource. The …

Dataset discovery and exploration: A survey

NW Paton, J Chen, Z Wu - ACM Computing Surveys, 2023 - dl.acm.org
Data scientists are tasked with obtaining insights from data. However, suitable data is often
not immediately at hand, and there may be many potentially relevant datasets in a data lake …

GitTables: A large-scale corpus of relational tables

M Hulsebos, Ç Demiralp, P Groth - … of the ACM on Management of Data, 2023 - dl.acm.org
The success of deep learning has sparked interest in improving relational table tasks, like
data preparation and search, with table representation models trained on large table …

Ten years of WebTables

M Cafarella, A Halevy, H Lee, J Madhavan… - Proceedings of the …, 2018 - dl.acm.org
In 2008, we wrote about WebTables, an effort to exploit the large and diverse set of
structured databases casually published online in the form of HTML tables. The past decade …

ENTRANT: A large financial dataset for table understanding

E Zavitsanos, D Mavroeidis, E Spyropoulou… - Scientific Data, 2024 - nature.com
Tabular data is a way to structure, organize, and present information conveniently and
effectively. Real-world tables present data in two dimensions by arranging cells in matrices …