Lotus: Enabling Semantic Queries with LLMs over Tables of Unstructured and Structured Data
The semantic capabilities of language models (LMs) have the potential to enable rich
analytics and reasoning over vast knowledge corpora. Unfortunately, existing systems lack …
DocETL: Agentic Query Rewriting and Evaluation for Complex Document Processing
S Shankar, T Chambers, T Shah… - arXiv preprint arXiv …, 2024 - arxiv.org
Analyzing unstructured data has been a persistent challenge in data processing. Large
Language Models (LLMs) have shown promise in this regard, leading to recent proposals …
The Design of an LLM-powered Unstructured Analytics System
E Anderson, J Fritz, A Lee, B Li, M Lindblad… - arXiv preprint arXiv …, 2024 - arxiv.org
LLMs demonstrate an uncanny ability to process unstructured data, and as such, have the
potential to go beyond search and run complex, semantic analyses at scale. We describe …
Variable Extraction for Model Recovery in Scientific Literature
The global output of academic publications exceeds 5 million articles per year, making it
difficult for humans to keep up with even a tiny fraction of scientific output. We need methods …
A Declarative System for Optimizing AI Workloads
Modern AI models provide the key to a long-standing dream: processing analytical queries
about almost any kind of data. Until recently, it was difficult and expensive to extract facts …
AOP: Automated and Interactive LLM Pipeline Orchestration for Answering Complex Queries
Current data lakes are limited to basic put/get functions on unstructured data and analytical
queries on structured data. They fall short in handling complex queries that require multi-hop …
Palimpzest: Optimizing AI-Powered Analytics with Declarative Query Processing
A long-standing goal of data management systems has been to build systems
which can compute quantitative insights over large collections of unstructured data in a cost …