An evaluation of Edge TPU accelerators for convolutional neural networks
Edge TPUs are a domain of accelerators for low-power, edge devices and are widely used
in various Google products such as Coral and Pixel devices. In this paper, we first discuss …
GRANITE: A graph neural network model for basic block throughput estimation
Analytical hardware performance models yield swift estimation of desired hardware
performance metrics. However, developing these analytical models for modern processors …
El Criterio de Información de Akaike en la Obtención de Modelos Estadísticos de Rendimiento
This article presents a method for obtaining statistical performance models of
parallel applications, based on model selection by means of the criterion of …
Analytical estimation of the scalability of iterative numerical algorithms on distributed memory multiprocessors
LB Sokolinsky - Lobachevskii Journal of Mathematics, 2018 - Springer
This article presents a new high-level parallel computational model named BSF (Bulk
Synchronous Farm). The BSF model extends the BSP model to deal with the …
Accurate analytical performance model of communications in MPI applications
This paper presents a new LogP-based model, called LoOgGP, which allows an accurate
characterization of MPI applications based on microbenchmark measurements. This new …
Parallel execution time prediction of the multitask parallel programs
R Wu, J Sun, J Chen - Performance Evaluation, 2008 - Elsevier
A critical problem of predicting the execution time of parallel programs is computing the
maximum execution time of tasks involved in the parallel computation. For a parallel …
Analytical performance models of parallel programs in clusters
This paper presents a framework based on a user-driven methodology to obtain analytical
models on parallel systems and, in particular, clusters. This framework consists of two …
Towards automated construction of compiler optimizations
TCY Mendis - 2020 - dspace.mit.edu
First, we present goSLP, a framework that uses integer linear programming to find a globally
pairwise-optimal statement packing strategy to achieve superior vectorization performance …
Performance modeling of MPI applications using model selection techniques
A new method for obtaining models of the performance of parallel applications based on
statistical analysis is presented in this paper. This method is based on Akaike's …
Software tools for performance modeling of parallel programs
This paper presents a framework based on a user-driven methodology to obtain analytical
models of MPI applications on parallel systems in a systematic and easy-to-use way. This …