Polly—performing polyhedral optimizations on a low-level intermediate representation
T Grosser, A Groesslinger, C Lengauer - Parallel Processing Letters, 2012 - World Scientific
The polyhedral model for loop parallelization has proved to be an effective tool for advanced
optimization and automatic parallelization of programs in higher-level languages. Yet, to …
Generating efficient quantum chemistry codes for novel architectures
AV Titov, IS Ufimtsev, N Luehr… - Journal of chemical …, 2013 - ACS Publications
We describe an extension of our graphics processing unit (GPU) electronic structure
program TeraChem to include atom-centered Gaussian basis sets with d angular …
Massively parallel processing core with plural chains of processing elements and respective smart memory storing select data received from each chain
S Cadambi, A Majumdar, M Becchi… - US Patent …, 2013 - Google Patents
An accelerator system is shown that includes a plurality of processing cores. Each
processing core includes a plurality of processing chains configured to perform parallel …
Quantum chemistry on graphical processing units. 1. Strategies for two-electron integral evaluation
IS Ufimtsev, TJ Martinez - Journal of Chemical Theory and …, 2008 - ACS Publications
Modern videogames place increasing demands on the computational and graphical
hardware, leading to novel architectures that have great potential in the context of high …
Do code clones matter?
E Juergens, F Deissenboeck… - 2009 IEEE 31st …, 2009 - ieeexplore.ieee.org
Code cloning is not only assumed to inflate maintenance costs but also considered defect-
prone as inconsistent changes to code duplicates can lead to unexpected behavior …
Quantum chemistry on graphical processing units. 2. Direct self-consistent-field implementation
IS Ufimtsev, TJ Martinez - Journal of Chemical Theory and …, 2009 - ACS Publications
We demonstrate the use of graphical processing units (GPUs) to carry out complete self-
consistent-field calculations for molecules with as many as 453 atoms (2131 basis …
Exploiting coarse-grained task, data, and pipeline parallelism in stream programs
MI Gordon, W Thies, S Amarasinghe - ACM SIGPLAN Notices, 2006 - dl.acm.org
As multicore architectures enter the mainstream, there is a pressing demand for high-level
programming models that can effectively map to them. Stream programming offers an …
Data-parallel hashing techniques for GPU architectures
Hash tables are a fundamental data structure for effectively storing and accessing sparse
data, with widespread usage in domains ranging from computer graphics to machine …
Methods for evaluating and covering the design space during early design development
M Gries - Integration, 2004 - Elsevier
This paper gives an overview of methods used for design space exploration (DSE) of micro-
architectures and systems. The DSE problem generally considers two orthogonal issues: (I) …
Excited-state electronic structure with configuration interaction singles and Tamm–Dancoff time-dependent density functional theory on graphical processing units
CM Isborn, N Luehr, IS Ufimtsev… - Journal of Chemical …, 2011 - ACS Publications
Excited-state calculations are implemented in a development version of the GPU-based
TeraChem software package using the configuration interaction singles (CIS) and adiabatic …