Programl: A graph-based program representation for data flow analysis and compiler optimizations
Machine learning (ML) is increasingly seen as a viable approach for building
compiler optimization heuristics, but many ML methods cannot replicate even the simplest of …
A survey of techniques for dynamic branch prediction
S Mittal - Concurrency and Computation: Practice and …, 2019 - Wiley Online Library
Branch predictor (BP) is an essential component in modern processors since high BP
accuracy can improve performance and reduce energy by decreasing the number of …
Qiskit pulse: programming quantum computers through the cloud with pulses
The quantum circuit model is an abstraction that hides the underlying physical
implementation of gates and measurements on a quantum computer. For precise control of …
Deepbindiff: Learning program-wide code representations for binary diffing
Binary diffing analysis quantitatively measures the differences between two given binaries
and produces fine-grained basic block matching. It has been widely used to enable different …
AUGEM: automatically generate high performance dense linear algebra kernels on x86 CPUs
Basic Linear Algebra Subprograms (BLAS) is a fundamental library in scientific computing. In
this paper, we present a template-based optimization framework, AUGEM, which can …
TRIMMER: application specialization for code debloating
With the proliferation of new hardware architectures and ever-evolving user requirements,
the software stack is becoming increasingly bloated. In practice, only a limited subset of the …
[BOOK] Heterogeneous computing with OpenCL 2.0
Heterogeneous Computing with OpenCL 2.0 teaches OpenCL and parallel programming for
complex systems that may include a variety of device architectures: multi-core CPUs, GPUs …
QED at large: A survey of engineering of formally verified software
Development of formal proofs of correctness of programs can increase actual and
perceived reliability and facilitate better understanding of program specifications and their …
Finding effective compilation sequences
Most modern compilers operate by applying a fixed, program-independent sequence of
optimizations to all programs. Compiler writers choose a single "compilation sequence", or …
Unleashing SmartNIC packet processing performance in P4
SmartNICs are on the rise as a packet processing platform, with the trend towards a uniform
P4 programming model. However, unleashing SmartNIC packet processing performance in …