CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices.
Concurrent inference execution on heterogeneous processors is critical to improve the
performance of increasingly heavy deep learning (DL) models. However, available …
A comprehensive benchmark of deep learning libraries on mobile devices
Deploying deep learning (DL) on mobile devices has been a notable trend in recent years.
To support fast inference of on-device DL, DL libraries play a critical role as algorithms and …
A comprehensive deep learning library benchmark and optimal library selection
Deploying deep learning (DL) on mobile devices has been a notable trend in recent years.
To support fast inference of on-device DL, DL libraries play a critical role as algorithms and …
**net: Efficient neural networks for tinyml
The recent interest in the edge-to-cloud continuum paradigm has emphasized the need for
simple and scalable architectures to deliver optimal performance on computationally …
Towards efficient vision transformer inference: A first study of transformers on mobile devices
Convolutional neural networks (CNNs) have long been dominating the model choice in
on-device intelligent mobile applications. Recently, we are witnessing the fast development of …
Deepperform: An efficient approach for performance testing of resource-constrained neural networks
Today, an increasing number of Adaptive Deep Neural Networks (AdNNs) are being used
on resource-constrained embedded devices. We observe that, similar to traditional software …
Blastnet: Exploiting duo-blocks for cross-processor real-time dnn inference
In recent years, Deep Neural Networks (DNNs) have been increasingly adopted by a wide
range of time-critical applications running on edge platforms with heterogeneous …
BitSET: Bit-serial early termination for computation reduction in convolutional neural networks
Convolutional Neural Networks (CNNs) have demonstrated remarkable performance across
a wide range of machine learning tasks. However, the high accuracy usually comes at the …
Rt-mdl: Supporting real-time mixed deep learning tasks on edge platforms
Recent years have witnessed an emerging class of real-time applications, e.g., autonomous
driving, in which resource-constrained edge platforms need to execute a set of real-time …
Context-aware compilation of dnn training pipelines across edge and cloud
Empowered by machine learning, edge devices including smartphones, wearable, and IoT
devices have become growingly intelligent, raising conflicts with the limited resource. On …