Checkmate: Breaking the memory wall with optimal tensor rematerialization
Modern neural networks are increasingly bottlenecked by the limited capacity of on-device
GPU memory. Prior work explores dropping activations as a strategy to scale to larger neural …
[BOOK][B] Engineering a compiler
KD Cooper, L Torczon - 2022 - books.google.com
Engineering a Compiler, Third Edition covers the latest developments in compiler
technology, with new chapters focusing on semantic elaboration (the problems that arise in …
[BOOK][B] Register allocation via graph coloring
P Briggs - 1992 - search.proquest.com
Chaitin and his colleagues at IBM in Yorktown Heights built the first global register allocator
based on graph coloring. This thesis describes a series of improvements and extensions to …
Intermediate representations in imperative compilers: A survey
J Stanier, D Watson - ACM Computing Surveys (CSUR), 2013 - dl.acm.org
Compilers commonly translate an input program into an intermediate representation (IR)
before optimizing it and generating code. Over time there have been a number of different …
[BOOK][B] The compiler design handbook: optimizations and machine code generation
YN Srikant, P Shankar - 2002 - taylorfrancis.com
The widespread use of object-oriented languages and Internet security concerns are just the
beginning. Add embedded systems, multiple memory banks, highly pipelined units …
[BOOK][B] Code generation for embedded processors
P Marwedel, G Goossens - 2013 - books.google.com
Modern electronics is driven by the explosive growth of digital communications and
multi-media technology. A basic challenge is to design first-time-right complex digital systems, that …
Value dependence graphs: Representation without taxation
D Weise, RF Crew, M Ernst… - Proceedings of the 21st …, 1994 - dl.acm.org
The value dependence graph (VDG) is a sparse dataflow-like representation that simplifies
program analysis and transformation. It is a functional representation that represents control …
An efficient representation for sparse sets
P Briggs, L Torczon - ACM Letters on Programming Languages and …, 1993 - dl.acm.org
Sets are a fundamental abstraction widely used in programming. Many representations are
possible, each offering different advantages. We describe a representation that supports …
Efficient rematerialization for deep networks
When training complex neural networks, memory usage can be an important bottleneck. The
question of when to rematerialize, i.e., to recompute intermediate values rather than retaining …
Efficient algorithms for computing the longest viable path in a combinational network
PC McGeer, RK Brayton - Proceedings of the 26th ACM/IEEE Design …, 1989 - dl.acm.org
We consider the elimination of false paths in combinational circuits. We give the single
generic algorithm that is used to solve this problem, and demonstrate that it is parameterized …