Deferred continuous batching in resource-efficient large language model serving
Despite that prior work of batched inference and parameter-efficient fine-tuning techniques
have reduced the resource requirements of large language models (LLMs), challenges …
Automation of AD-OHC Dashbord and Monitoring of Cloud Resources using Genrative AI to Reduce Costing and Enhance Performance
P Chavan, P Chavan - 2024 International Conference on …, 2024 - ieeexplore.ieee.org
In this extensive review, the incorporation of Generative Artificial Intelligence (AI) into ad-hoc
dashboards and cloud resource monitoring is investigated in depth. A purpose of this work is …
State of the Art in Parallel and Distributed Systems: Emerging Trends and Challenges
Driven by rapid advancements in interconnection, packaging, integration, and computing
technologies, parallel and distributed systems have significantly evolved in recent years …
BOOM: Use your Desktop to Accurately Predict the Performance of Large Deep Neural Networks
The intensive computational requirements of training deep neural networks (DNNs) have
significantly driven the adoption of DNN accelerators like Graph Processing Units (GPU) …
Large Generative Model-enabled Digital Twin for 6G Networks
Y Yang, W Sun, J He, Y Fu, L Xu - IEEE Network, 2024 - ieeexplore.ieee.org
The next generation (6G) wireless networks are under intensive research and envisioned to
realize the interconnection of everything and ubiquitous intelligence. One of the major …
Performance evaluation of cloud database in terms of response time using tenancy model and in-memory database
A Shah, M Patel, M Patel - 2024 IEEE International Conference …, 2024 - ieeexplore.ieee.org
Cloud databases are now essential parts of contemporary information systems, providing
scalability, flexibility, and cost-effectiveness to enterprises in various industries. Cloud …
Enhancing Operational Data Synthesis and Predictive Analysis in HPC Clusters Using Large Language Models
Y Zang - 2024 - atlarge-research.com
Abstract High-Performance Computing (HPC) clusters are integral to advancing scientific
research, industrial optimization, and various computational tasks. Researchers, industrial …
Intelligent Network Optimization in Cloud Environments with Generative AI and LLMs
K Patil, B Desai - 2024 - preprints.org
This paper represents a groundbreaking paradigm shift in network optimization. Departing
from traditional static methodologies, this innovative approach harnesses the power of …
Generative AI Meets Cloud Networking: A New Era of Dynamic Optimization
H Miyamoto, SNA Tan - Asian American Research Letters Journal, 2024 - aarlj.com
This paper represents a groundbreaking paradigm shift in network optimization. Departing
from traditional static methodologies, this innovative approach harnesses the power of …