Splitwise: Efficient generative LLM inference using phase splitting

P Patel, E Choukse, C Zhang, A Shah… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Generative large language model (LLM) applications are growing rapidly, leading to large-
scale deployments of expensive and power-hungry GPUs. Our characterization of LLM …

LLM inference serving: Survey of recent advances and opportunities

B Li, Y Jiang, V Gadepally, D Tiwari - arXiv preprint arXiv:2407.12391, 2024 - arxiv.org
This survey offers a comprehensive overview of recent advancements in Large Language
Model (LLM) serving systems, focusing on research since the year 2023. We specifically …

DynamoLLM: Designing LLM inference clusters for performance and energy efficiency

J Stojkovic, C Zhang, Í Goiri, J Torrellas… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid evolution and widespread adoption of generative large language models (LLMs)
have made them a pivotal workload in various applications. Today, LLM inference clusters …

Offline energy-optimal LLM serving: Workload-based energy models for LLM inference on heterogeneous systems

G Wilkins, S Keshav, R Mortier - arXiv preprint arXiv:2407.04014, 2024 - arxiv.org
The rapid adoption of large language models (LLMs) has led to significant advances in
natural language processing and text generation. However, the energy consumed through …

Reconciling the contrasting narratives on the environmental impact of large language models

S Ren, B Tomlinson, RW Black, AW Torrance - Scientific Reports, 2024 - nature.com
The recent proliferation of large language models (LLMs) has led to divergent narratives
about their environmental impacts. Some studies highlight the substantial carbon footprint of …

PerLLM: Personalized inference scheduling with edge-cloud collaboration for diverse LLM services

Z Yang, Y Yang, C Zhao, Q Guo, W He, W Ji - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid growth in the number of large language model (LLM) users, it is difficult for
bandwidth-constrained cloud servers to simultaneously process massive LLM services in …

TAPAS: Thermal- and Power-Aware Scheduling for LLM Inference in Cloud Platforms

J Stojkovic, C Zhang, Í Goiri, E Choukse, H Qiu… - arXiv preprint arXiv …, 2025 - arxiv.org
The rising demand for generative large language models (LLMs) poses challenges for thermal and power management in cloud datacenters. Traditional techniques are often …

A survey of small language models

C Van Nguyen, X Shen, R Aponte, Y Xia… - arXiv preprint arXiv …, 2024 - arxiv.org
Small Language Models (SLMs) have become increasingly important due to their efficiency and their ability to perform various language tasks with minimal computational resources …

The unseen AI disruptions for power grids: LLM-induced transients

Y Li, M Mughees, Y Chen, YR Li - arXiv preprint arXiv:2409.11416, 2024 - arxiv.org
Recent breakthroughs in large language models (LLMs) have exhibited superior capability across major industries and stimulated multi-hundred-billion-dollar investment in AI-centric …

Datacenter power and energy management: past, present, and future

R Bianchini, C Belady, A Sivasubramaniam - IEEE Micro, 2024 - ieeexplore.ieee.org
This article surveys key past developments in cloud data center power and energy management, describes where we are today, and considers what the future could be. This topic is gaining …