Edge AI: A taxonomy, systematic review and future directions

SS Gill, M Golec, J Hu, M Xu, J Du, H Wu, GK Walia… - Cluster …, 2025 - Springer
Edge Artificial Intelligence (AI) incorporates a network of interconnected systems
and devices that receive, cache, process, and analyse data in close communication with the …

A survey of state-of-the-art on edge computing: Theoretical models, technologies, directions, and development paths

B Liu, Z Luo, H Chen, C Li - IEEE Access, 2022 - ieeexplore.ieee.org
To describe the roadmap of current edge computing research activities, we first
present a brief overview of the most advanced edge computing surveys published in the last …

Offline energy-optimal LLM serving: Workload-based energy models for LLM inference on heterogeneous systems

G Wilkins, S Keshav, R Mortier - arXiv preprint arXiv:2407.04014, 2024 - arxiv.org
The rapid adoption of large language models (LLMs) has led to significant advances in
natural language processing and text generation. However, the energy consumed through …

ChainNet: A Customized Graph Neural Network Model for Loss-Aware Edge AI Service Deployment

Z Niu, M Roveri, G Casale - 2024 54th Annual IEEE/IFIP …, 2024 - ieeexplore.ieee.org
Edge AI seeks the deployment of deep neural network (DNN)-based services across
distributed edge devices, embedding intelligence close to data sources. Due to capacity …

DeepSLOs for the computing continuum

V Casamayor Pujol, B Sedlak, Y Xu, PK Donta… - Proceedings of the …, 2024 - dl.acm.org
The advent of the computing continuum, i.e., the blending of all existing computational tiers,
calls for novel techniques and methods that consider its complex dynamics. This work …

SPACE4AI-R: a runtime management tool for AI applications component placement and resource scaling in computing continua

F Filippini, H Sedghani, D Ardagna - Proceedings of the IEEE/ACM 16th …, 2023 - dl.acm.org
The recent migration towards the Internet of Things has driven the rise of a Computing
Continuum paradigm where Edge and Cloud resources coordinate to support the execution …

Understanding the Benefits of Hardware-Accelerated Communication in Model-Serving Applications

WA Hanafy, L Wang, H Chang… - 2023 IEEE/ACM 31st …, 2023 - ieeexplore.ieee.org
It is commonly assumed that the end-to-end networking performance of edge offloading is
purely dictated by that of the network connectivity between end devices and edge computing …

D-STACK: High Throughput DNN Inference by Effective Multiplexing and Spatio-Temporal Scheduling of GPUs

A Dhakal, SG Kulkarni… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Hardware accelerators such as GPUs are required for real-time, low-latency inference with
Deep Neural Networks (DNNs). Providing inference services in the cloud can be resource …

CloudAIBus: a testbed for AI based cloud computing environments

S Velu, SS Gill, SS Murugesan, H Wu, X Li - Cluster Computing, 2024 - Springer
Smart resource allocation is essential for optimising cloud computing efficiency and
utilisation, but it is also very challenging, as traditional approaches often overprovision CPU …

Load-Aware Orchestrator for Edge Computing-Aided Wireless Augmented Reality

W Qian, RWL Coutinho - IEEE Internet of Things Journal, 2024 - ieeexplore.ieee.org
Mobile Augmented Reality (MAR) has gained increased attention thanks to its potential to
transform applications in different domains. One of the challenges to realizing MAR systems …