Large Language Models (LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey

X Fang, W Xu, FA Tan, J Zhang, Z Hu, Y Qi… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent breakthroughs in large language modeling have facilitated rigorous exploration of
their application in diverse tasks related to tabular data modeling, such as prediction, tabular …

Large language model for table processing: A survey

W Lu, J Zhang, J Fan, Z Fu, Y Chen, X Du - Frontiers of Computer Science, 2025 - Springer
Tables, typically two-dimensional and structured to store large amounts of data, are
essential in daily activities like database queries, spreadsheet manipulations, Web table …

Debug like a human: A large language model debugger via verifying runtime execution step-by-step

L Zhong, Z Wang, J Shang - arXiv preprint arXiv:2402.16906, 2024 - arxiv.org
Large language models (LLMs) are leading significant progress in code generation. Beyond
one-pass code generation, recent works further integrate unit tests and program verifiers into …

Why tabular foundation models should be a research priority

B Van Breugel, M Van Der Schaar - arXiv preprint arXiv:2405.01147, 2024 - arxiv.org
Recent text and image foundation models are incredibly impressive, and these models are
attracting an ever-increasing portion of research resources. In this position piece we aim to …

Data-copilot: Bridging billions of data and humans with autonomous workflow

W Zhang, Y Shen, W Lu, Y Zhuang - arXiv preprint arXiv:2306.07209, 2023 - arxiv.org
Industries such as finance, meteorology, and energy generate vast amounts of data daily.
Efficiently managing, processing, and displaying this data requires specialized expertise …

Beyond chain-of-thought: A survey of chain-of-x paradigms for llms

Y Xia, R Wang, X Liu, M Li, T Yu, X Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Chain-of-Thought (CoT) has been a widely adopted prompting method, eliciting impressive
reasoning abilities of Large Language Models (LLMs). Inspired by the sequential thought …

Large language models for tabular data: Progresses and future directions

H Dong, Z Wang - Proceedings of the 47th International ACM SIGIR …, 2024 - dl.acm.org
Tables contain a significant portion of the world's structured information. The ability to
efficiently and accurately understand, process, reason about, analyze, and generate tabular …

Found in the middle: Calibrating positional attention bias improves long context utilization

CY Hsieh, YS Chuang, CL Li, Z Wang, LT Le… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), even when specifically trained to process long input
contexts, struggle to capture relevant information located in the middle of their input. This …

"It's like a rubber duck that talks back": Understanding Generative AI-Assisted Data Analysis Workflows through a Participatory Prompting Study

I Drosos, A Sarkar, X Xu, C Negreanu, S Rintel… - Proceedings of the 3rd …, 2024 - dl.acm.org
Generative AI tools can help users with many tasks. One such task is data analysis, which is
notoriously challenging for non-expert end-users due to its expertise requirements, and …

Large language models for data annotation and synthesis: A survey

Z Tan, D Li, S Wang, A Beigi, B Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
Data annotation and synthesis generally refers to the labeling or generating of raw data with
relevant information, which could be used for improving the efficacy of machine learning …