Evaluating language models for efficient code generation
We introduce Differential Performance Evaluation (DPE), a framework designed to reliably
evaluate Large Language Models (LLMs) for efficient code generation. Traditional coding …
Tpugraphs: A performance prediction dataset on large tensor computational graphs
Precise hardware performance models play a crucial role in code optimizations. They can
assist compilers in making heuristic decisions or aid autotuners in identifying the optimal …
Language models for code optimization: Survey, challenges and future directions
Language models (LMs) built upon deep neural networks (DNNs) have recently
demonstrated breakthrough effectiveness in software engineering tasks like code …
WACO: learning workload-aware co-optimization of the format and schedule of a sparse tensor program
In this paper, we present WACO, a novel method of co-optimizing the format and the
schedule of a given sparsity pattern in a sparse tensor program. A core challenge in this …
Supersonic: Learning to generate source code optimizations in C/C++
Software optimization refines programs for resource efficiency while preserving functionality.
Traditionally, it is a process done by developers and compilers. This paper introduces a third …
Tenset: A large-scale program performance dataset for learned tensor compilers
Search-based tensor compilers can greatly accelerate the execution of machine learning
models by generating high-performance tensor programs, such as matrix multiplications and …
A flexible approach to autotuning multi-pass machine learning compilers
Search-based techniques have been demonstrated effective in solving complex optimization
problems that arise in domain-specific compilers for machine learning (ML). Unfortunately …
Tensor program optimization with probabilistic programs
Automatic optimization for tensor programs becomes increasingly important as we deploy
deep learning in various environments, and efficient optimization relies on a rich search …
Tlp: A deep learning-based cost model for tensor program tuning
Tensor program tuning is a non-convex objective optimization problem, to which search-
based approaches have proven to be effective. At the core of the search-based approaches …
PyGim: An Efficient Graph Neural Network Library for Real Processing-In-Memory Architectures
Graph Neural Networks (GNNs) are emerging models to analyze graph-structure data. GNN
execution involves both compute-intensive and memory-intensive kernels. The latter kernels …