Google Scholar

Lmsys-chat-1m: A large-scale real-world llm conversation dataset

L Zheng, WL Chiang, Y Sheng, T Li, S Zhuang… - arxiv preprint arxiv …, 2023 - arxiv.org

Studying how people interact with large language models (LLMs) in real-world scenarios is
increasingly important due to their widespread use in various applications. In this paper, we …

Cited by 114

[PDF] arxiv.org

Towards efficient generative large language model serving: A survey from algorithms to systems

X Miao, G Oliaro, Z Zhang, X Cheng, H Jin… - arxiv preprint arxiv …, 2023 - arxiv.org

In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …

Cited by 73

[PDF] arxiv.org

Automix: Automatically mixing language models

P Aggarwal, A Madaan, A Anand, SP Potharaju… - arxiv preprint arxiv …, 2023 - arxiv.org

Large language models (LLMs) are now available from cloud API providers in various sizes
and configurations. While this diversity offers a broad spectrum of choices, effectively …

Cited by 19

Enhancing on-device llm inference with historical cloud-based llm interactions

Y Ding, C Niu, F Wu, S Tang, C Lyu… - Proceedings of the 30th …, 2024 - dl.acm.org

Many billion-scale large language models (LLMs) have been released for resource-
constraint mobile devices to provide local LLM inference service when cloud-based …

Cited by 4

Graphrouter: A graph-based router for llm selections

T Feng, Y Shen, J You - arxiv preprint arxiv:2410.03834, 2024 - arxiv.org

The rapidly growing number and variety of Large Language Models (LLMs) present
significant challenges in efficiently selecting the appropriate LLM for a given query …

Cited by 5

[PDF] arxiv.org

Model compression and efficient inference for large language models: A survey

W Wang, W Chen, Y Luo, Y Long, Z Lin… - arxiv preprint arxiv …, 2024 - arxiv.org

Transformer based large language models have achieved tremendous success. However,
the significant memory and computational costs incurred during the inference process make …

Cited by 27

[PDF] arxiv.org

Cache & distil: Optimising API calls to large language models

G Ramírez, M Lindemann, A Birch, I Titov - arxiv preprint arxiv …, 2023 - arxiv.org

Large-scale deployment of generative AI tools often depends on costly API calls to a Large
Language Model (LLM) to fulfil user queries. To curtail the frequency of these calls, one can …

Cited by 9

[PDF] arxiv.org

Teola: Towards end-to-end optimization of llm-based applications

X Tan, Y Jiang, Y Yang, H Xu - arxiv preprint arxiv:2407.00326, 2024 - arxiv.org

Large language model (LLM)-based applications consist of both LLM and non-LLM
components, each contributing to the end-to-end latency. Despite great efforts to optimize …

Cited by 4

[PDF] arxiv.org

A Survey on Effective Invocation Methods of Massive LLM Services

C Wang, B Zhang, D Sui, Z Tu, X Liu… - arxiv preprint arxiv …, 2024 - arxiv.org

Language models as a service (LMaaS) enable users to accomplish tasks without requiring
specialized knowledge, simply by paying a service provider. However, numerous providers …

Cited by 5

[HTML] sciencedirect.com

[HTML] Leveraging LLMs for optimised feature selection and embedding in structured data: A case study on graduate employment classification

R Haque, HN Goh, CY Ting, A Quek… - Computers and Education …, 2025 - Elsevier

Abstract The application of Machine Learning (ML) for predicting graduate student
employability is a growing area of research, driven by the need to align educational …
