Who validates the validators? Aligning LLM-assisted evaluation of LLM outputs with human preferences

S Shankar, JD Zamfirescu-Pereira… - Proceedings of the 37th …, 2024 - dl.acm.org
Due to the cumbersome nature of human evaluation and limitations of code-based
evaluation, Large Language Models (LLMs) are increasingly being used to assist humans in …

Automated design of agentic systems

S Hu, C Lu, J Clune - arXiv preprint arXiv …, 2024 - arxiv.org
… powerful general-purpose agents,
wherein Foundation Models are used as modules within agentic systems (e.g. Chain-of …

Recommendation with generative models

Y Deldjoo, Z He, J McAuley, A Korikov… - arXiv preprint arXiv …, 2024 - arxiv.org
Generative models are a class of AI models capable of creating new instances of data by
learning and sampling from their statistical distributions. In recent years, these models have …

The path forward for large language models in medicine is open

L Riedemann, M Labonne, S Gilbert - npj Digital Medicine, 2024 - nature.com
Large language models (LLMs) are increasingly applied in medical documentation and
have been proposed for clinical decision support. We argue that the future for LLMs in …

On the limitations of compute thresholds as a governance strategy

S Hooker - arXiv preprint arXiv:2407.05694, 2024 - arxiv.org
At face value, this essay is about understanding a fairly esoteric governance tool called
compute thresholds. However, in order to grapple with whether these thresholds will achieve …

NeurDB: an AI-powered autonomous data system

BC Ooi, S Cai, G Chen, Y Shen, KL Tan, Y Wu… - Science China …, 2024 - Springer
In the wake of rapid advancements in artificial intelligence (AI), we stand on the brink of a
transformative leap in data systems. The imminent fusion of AI and DB (AI×DB) promises a …

Databases Unbound: Querying All of the World's Bytes with AI

S Madden, M Cafarella, M Franklin… - Proceedings of the VLDB …, 2024 - dl.acm.org
Over the past five decades, the relational database model has proven to be a scaleable and
adaptable model for querying a variety of structured data, with use cases in analytics …

A perspective for adapting generalist AI to specialized medical AI applications and their challenges

Z Wang, H Wang, B Danek, Y Li, C Mack… - arXiv preprint arXiv …, 2024 - arxiv.org
The integration of Large Language Models (LLMs) into medical applications has sparked
widespread interest across the healthcare industry, from drug discovery and development to …

Queue management for SLO-oriented large language model serving

A Patke, D Reddy, S Jha, H Qiu, C Pinto… - Proceedings of the …, 2024 - dl.acm.org
Large language model (LLM) serving is becoming an increasingly critical workload for cloud
providers. Existing LLM serving systems focus on interactive requests, such as chatbots and …

StructuredRAG: JSON response formatting with large language models

C Shorten, C Pierse, TB Smith, E Cardenas… - arXiv preprint arXiv …, 2024 - arxiv.org
The ability of Large Language Models (LLMs) to generate structured outputs, such as JSON,
is crucial for their use in Compound AI Systems. However, evaluating and improving this …