Challenges and applications of large language models
Efficient methods for natural language processing: A survey
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …
Cost-Efficient large language model serving for multi-turn conversations with CachedAttention
Interacting with humans through multi-turn conversations is a fundamental feature of large
language models (LLMs). However, existing LLM serving engines executing multi-turn …