LMSYS-Chat-1M: A large-scale real-world LLM conversation dataset
Studying how people interact with large language models (LLMs) in real-world scenarios is
increasingly important due to their widespread use in various applications. In this paper, we …
Towards efficient generative large language model serving: A survey from algorithms to systems
In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …
AutoMix: Automatically mixing language models
Large language models (LLMs) are now available from cloud API providers in various sizes
and configurations. While this diversity offers a broad spectrum of choices, effectively …
Enhancing on-device LLM inference with historical cloud-based LLM interactions
Many billion-scale large language models (LLMs) have been released for resource-
constraint mobile devices to provide local LLM inference service when cloud-based …
GraphRouter: A graph-based router for LLM selections
The rapidly growing number and variety of Large Language Models (LLMs) present
significant challenges in efficiently selecting the appropriate LLM for a given query …
Model compression and efficient inference for large language models: A survey
Transformer based large language models have achieved tremendous success. However,
the significant memory and computational costs incurred during the inference process make …
Cache & distil: Optimising API calls to large language models
Large-scale deployment of generative AI tools often depends on costly API calls to a Large
Language Model (LLM) to fulfil user queries. To curtail the frequency of these calls, one can …
Teola: Towards end-to-end optimization of LLM-based applications
Large language model (LLM)-based applications consist of both LLM and non-LLM
components, each contributing to the end-to-end latency. Despite great efforts to optimize …
A Survey on Effective Invocation Methods of Massive LLM Services
Language models as a service (LMaaS) enable users to accomplish tasks without requiring
specialized knowledge, simply by paying a service provider. However, numerous providers …
Leveraging LLMs for optimised feature selection and embedding in structured data: A case study on graduate employment classification
The application of Machine Learning (ML) for predicting graduate student
employability is a growing area of research, driven by the need to align educational …