Web table extraction, retrieval, and augmentation: A survey

S Zhang, K Balog - ACM Transactions on Intelligent Systems and …, 2020 - dl.acm.org
Tables are powerful and popular tools for organizing and manipulating data. A vast number
of tables can be found on the Web, which represent a valuable knowledge resource. The …

Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

TURL: Table understanding through representation learning

X Deng, H Sun, A Lees, Y Wu, C Yu - ACM SIGMOD Record, 2022 - dl.acm.org
Relational tables on the Web store a vast amount of knowledge. Owing to the wealth of such
tables, there has been tremendous progress on a variety of tasks in the area of table …

TABBIE: Pretrained representations of tabular data

H Iida, D Thai, V Manjunatha, M Iyyer - arXiv preprint arXiv:2105.02584, 2021 - arxiv.org
Existing work on tabular representation learning jointly models tables and associated text
using self-supervised objective functions derived from pretrained language models such as …

TUTA: Tree-based transformers for generally structured table pre-training

Z Wang, H Dong, R Jia, J Li, Z Fu, S Han… - Proceedings of the 27th …, 2021 - dl.acm.org
We propose TUTA, a unified pre-training architecture for understanding generally structured
tables. Noticing that understanding a table requires spatial, hierarchical, and semantic …

Table structure recognition using top-down and bottom-up cues

S Raja, A Mondal, CV Jawahar - … Conference, Glasgow, UK, August 23–28 …, 2020 - Springer
Tables are information-rich structured objects in document images. While significant work
has been done in localizing tables as graphic objects in document images, only limited …

GraPPa: Grammar-augmented pre-training for table semantic parsing

T Yu, CS Wu, XV Lin, B Wang, YC Tan, X Yang… - arXiv preprint arXiv …, 2020 - arxiv.org
We present GraPPa, an effective pre-training approach for table semantic parsing that learns
a compositional inductive bias in the joint representations of textual and tabular data. We …

TabularNet: A neural network architecture for understanding semantic structures of tabular data

L Du, F Gao, X Chen, R Jia, J Wang, J Zhang… - Proceedings of the 27th …, 2021 - dl.acm.org
Tabular data are ubiquitous for the widespread applications of tables and hence have
attracted the attention of researchers to extract underlying information. One of the critical …

Large language models for tabular data: Progresses and future directions

H Dong, Z Wang - Proceedings of the 47th International ACM SIGIR …, 2024 - dl.acm.org
Tables contain a significant portion of the world's structured information. The ability to
efficiently and accurately understand, process, reason about, analyze, and generate tabular …

A hierarchical model for data-to-text generation

C Rebuffel, L Soulier, G Scoutheeten… - Advances in Information …, 2020 - Springer
Transcribing structured data into natural language descriptions has emerged as a
challenging task, referred to as “data-to-text”. These structures generally regroup multiple …