LOTUS: Enabling Semantic Queries with LLMs over Tables of Unstructured and Structured Data

L Patel, S Jha, C Guestrin, M Zaharia - arXiv preprint arXiv:2407.11418, 2024 - arxiv.org
The semantic capabilities of language models (LMs) have the potential to enable rich
analytics and reasoning over vast knowledge corpora. Unfortunately, existing systems lack …

DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing

S Shankar, T Chambers, T Shah… - arXiv preprint arXiv …, 2024 - arxiv.org
Analyzing unstructured data has been a persistent challenge in data processing. Large
Language Models (LLMs) have shown promise in this regard, leading to recent proposals …

The Design of an LLM-powered Unstructured Analytics System

E Anderson, J Fritz, A Lee, B Li, M Lindblad… - arXiv preprint arXiv …, 2024 - arxiv.org
LLMs demonstrate an uncanny ability to process unstructured data, and as such, have the
potential to go beyond search and run complex, semantic analyses at scale. We describe …

Variable Extraction for Model Recovery in Scientific Literature

C Liu, E Noriega-Atala, A Pyarelal, CT Morrison… - arXiv preprint arXiv …, 2024 - arxiv.org
The global output of academic publications exceeds 5 million articles per year, making it
difficult for humans to keep up with even a tiny fraction of scientific output. We need methods …

A Declarative System for Optimizing AI Workloads

C Liu, M Russo, M Cafarella, L Cao, PB Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern AI models provide the key to a long-standing dream: processing analytical queries
about almost any kind of data. Until recently, it was difficult and expensive to extract facts …

AOP: Automated and Interactive LLM Pipeline Orchestration for Answering Complex Queries

J Wang, G Li - vldb.org
Current data lakes are limited to basic put/get functions on unstructured data and analytical
queries on structured data. They fall short in handling complex queries that require multi-hop …

Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing

C Liu, M Russo, M Cafarella, L Cao, PB Chen, Z Chen… - vldb.org
A long-standing goal of data management systems has been to build systems
which can compute quantitative insights over large collections of unstructured data in a cost …