CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices

F Jia, D Zhang, T Cao, S Jiang, Y Liu, J Ren, Y Zhang - MobiSys, 2022 - chrisplus.me
Concurrent inference execution on heterogeneous processors is critical to improve the
performance of increasingly heavy deep learning (DL) models. However, available …

A comprehensive benchmark of deep learning libraries on mobile devices

Q Zhang, X Li, X Che, X Ma, A Zhou, M Xu… - Proceedings of the …, 2022 - dl.acm.org
Deploying deep learning (DL) on mobile devices has been a notable trend in recent years.
To support fast inference of on-device DL, DL libraries play a critical role as algorithms and …

A comprehensive deep learning library benchmark and optimal library selection

Q Zhang, X Che, Y Chen, X Ma, M Xu… - IEEE Transactions …, 2023 - ieeexplore.ieee.org
Deploying deep learning (DL) on mobile devices has been a notable trend in recent years.
To support fast inference of on-device DL, DL libraries play a critical role as algorithms and …

XiNet: Efficient neural networks for tinyML

A Ancilotto, F Paissan, E Farella - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
The recent interest in the edge-to-cloud continuum paradigm has emphasized the need for
simple and scalable architectures to deliver optimal performance on computationally …

Towards efficient vision transformer inference: A first study of transformers on mobile devices

X Wang, LL Zhang, Y Wang, M Yang - Proceedings of the 23rd annual …, 2022 - dl.acm.org
Convolution neural networks (CNNs) have long been dominating the model choice in on-
device intelligent mobile applications. Recently, we are witnessing the fast development of …

DeepPerform: An efficient approach for performance testing of resource-constrained neural networks

S Chen, M Haque, C Liu, W Yang - Proceedings of the 37th IEEE/ACM …, 2022 - dl.acm.org
Today, an increasing number of Adaptive Deep Neural Networks (AdNNs) are being used
on resource-constrained embedded devices. We observe that, similar to traditional software …

BlastNet: Exploiting duo-blocks for cross-processor real-time DNN inference

N Ling, X Huang, Z Zhao, N Guan, Z Yan… - Proceedings of the 20th …, 2022 - dl.acm.org
In recent years, Deep Neural Network (DNN) has been increasingly adopted by a wide
range of time-critical applications running on edge platforms with heterogeneous …

BitSET: Bit-serial early termination for computation reduction in convolutional neural networks

Y Pan, J Yu, A Lukefahr, R Das, S Mahlke - ACM Transactions on …, 2023 - dl.acm.org
Convolutional Neural Networks (CNNs) have demonstrated remarkable performance across
a wide range of machine learning tasks. However, the high accuracy usually comes at the …

RT-mDL: Supporting real-time mixed deep learning tasks on edge platforms

N Ling, K Wang, Y He, G Xing, D Xie - … of the 19th ACM conference on …, 2021 - dl.acm.org
Recent years have witnessed an emerging class of real-time applications, e.g., autonomous
driving, in which resource-constrained edge platforms need to execute a set of real-time …

Context-aware compilation of DNN training pipelines across edge and cloud

D Yao, L Xiang, Z Wang, J Xu, C Li… - Proceedings of the ACM on …, 2021 - dl.acm.org
Empowered by machine learning, edge devices including smartphones, wearable, and IoT
devices have become growingly intelligent, raising conflicts with the limited resource. On …