The digital revolution of Earth-system science
Computational science is crucial for delivering reliable weather and climate predictions.
However, despite decades of high-performance computing experience, there is serious …
However, despite decades of high-performance computing experience, there is serious …
Instead of rewriting foreign code for machine learning, automatically synthesize fast gradients
Applying differentiable programming techniques and machine learning algorithms to foreign
programs requires developers to either rewrite their code in a machine learning framework …
programs requires developers to either rewrite their code in a machine learning framework …
Productivity, portability, performance: Data-centric Python
Python has become the de facto language for scientific computing. Programming in Python
is highly productive, mainly due to its rich science-oriented software ecosystem built around …
is highly productive, mainly due to its rich science-oriented software ecosystem built around …
Scalable distributed high-order stencil computations
Stencil computations lie at the heart of many scientific and industrial applications. Stencil
algorithms pose several challenges on machines with cache based memory hierarchy, due …
algorithms pose several challenges on machines with cache based memory hierarchy, due …
Progressive raising in multi-level ir
Multi-level intermediate representations (IR) show great promise for lowering the design
costs for domain-specific compilers by providing a reusable, extensible, and non-opini …
costs for domain-specific compilers by providing a reusable, extensible, and non-opini …
Toward accelerated stencil computation by adapting tensor core unit on gpu
The Tensor Core Unit (TCU) has been increasingly adopted on modern high performance
processors, specialized in boosting the performance of general matrix multiplication …
processors, specialized in boosting the performance of general matrix multiplication …
Automatic creation of high-bandwidth memory architectures from domain-specific languages: The case of computational fluid dynamics
Numerical simulations can help solve complex problems. Most of these algorithms are
massively parallel and thus good candidates for FPGA acceleration thanks to spatial …
massively parallel and thus good candidates for FPGA acceleration thanks to spatial …
High-performance gpu-to-cpu transpilation and optimization via high-level parallel constructs
While parallelism remains the main source of performance, architectural implementations
and programming models change with each new hardware generation, often leading to …
and programming models change with each new hardware generation, often leading to …
Productive performance engineering for weather and climate modeling with python
T Ben-Nun, L Groner, F Deconinck… - … Conference for High …, 2022 - ieeexplore.ieee.org
Earth system models are developed with a tight coupling to target hardware, often
containing specialized code predicated on processor characteristics. This coupling stems …
containing specialized code predicated on processor characteristics. This coupling stems …
OCC: An automated end-to-end machine learning optimizing compiler for computing-in-memory
Memristive devices promise an alternative approach toward non-Von Neumann
architectures, where specific computational tasks are performed within the memory devices …
architectures, where specific computational tasks are performed within the memory devices …