Scalable algorithms for molecular dynamics simulations on commodity clusters

KJ Bowers, E Chow, H Xu, RO Dror… - Proceedings of the …, 2006 - dl.acm.org
Although molecular dynamics (MD) simulations of biomolecular systems often run for days to
months, many events of great scientific interest and pharmaceutical relevance occur on long …

Addressing failures in exascale computing

M Snir, RW Wisniewski, JA Abraham… - … Journal of High …, 2014 - journals.sagepub.com
We present here a report produced by a workshop on 'Addressing failures in exascale
computing' held in Park City, Utah, 4–11 August 2012. The charter of this workshop was to …

Architecture exploration for ambient energy harvesting nonvolatile processors

K Ma, Y Zheng, S Li, K Swaminathan… - 2015 IEEE 21st …, 2015 - ieeexplore.ieee.org
Energy harvesting has been widely investigated as a promising method of providing power
for ultra-low-power applications. Such energy sources include solar energy, radio-frequency …

Deterministic replay: A survey

Y Chen, S Zhang, Q Guo, L Li, R Wu… - ACM Computing Surveys …, 2015 - dl.acm.org
Deterministic replay is a type of emerging technique dedicated to providing deterministic
executions of computer programs in the presence of nondeterministic factors. The …

ThyNVM: Enabling software-transparent crash consistency in persistent memory systems

J Ren, J Zhao, S Khan, J Choi, Y Wu… - Proceedings of the 48th …, 2015 - dl.acm.org
Emerging byte-addressable nonvolatile memories (NVMs) promise persistent memory,
which allows processors to directly access persistent data in main memory. Yet, persistent …

Detailed design and evaluation of redundant multithreading alternatives

SS Mukherjee, M Kontz, SK Reinhardt - ACM SIGARCH Computer …, 2002 - dl.acm.org
Exponential growth in the number of on-chip transistors, coupled with reductions in voltage
levels, makes each generation of microprocessors increasingly vulnerable to transient faults …

[BOOK][B] Architecture design for soft errors

S Mukherjee - 2011 - books.google.com
Architecture Design for Soft Errors provides a comprehensive description of the architectural
techniques to tackle the soft error problem. It covers the new methodologies for quantitative …

DMTCP: Transparent checkpointing for cluster computations and the desktop

J Ansel, K Arya, G Cooperman - 2009 IEEE international …, 2009 - ieeexplore.ieee.org
DMTCP (distributed multithreaded checkpointing) is a transparent user-level checkpointing
package for distributed applications. Checkpointing and restart is demonstrated for a wide …

A "flight data recorder" for enabling full-system multiprocessor deterministic replay

M Xu, R Bodik, MD Hill - Proceedings of the 30th annual international …, 2003 - dl.acm.org
Debuggers have been proven indispensable in improving software reliability. Unfortunately,
on most real-life software, debuggers fail to deliver their most essential feature---a faithful …

Bugnet: Continuously recording program execution for deterministic replay debugging

S Narayanasamy, G Pokam… - … Symposium on Computer …, 2005 - ieeexplore.ieee.org
Significant time is spent by companies trying to reproduce and fix the bugs that occur for
released code. To assist developers, we propose the BugNet architecture to continuously …