Design, modeling, and evaluation of a scalable multi-level checkpointing system A Moody, G Bronevetsky, K Mohror, BR De Supinski SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 845 | 2010 |
Automated application-level checkpointing of MPI programs G Bronevetsky, D Marques, K Pingali, P Stodghill Proceedings of the ninth ACM SIGPLAN symposium on Principles and practice of …, 2003 | 298 | 2003 |
Soft error vulnerability of iterative linear algebra methods G Bronevetsky, B de Supinski Proceedings of the 22nd annual international conference on Supercomputing …, 2008 | 241 | 2008 |
Application-level checkpointing for shared memory programs G Bronevetsky, D Marques, K Pingali, P Szwed, M Schulz ACM SIGPLAN Notices 39 (11), 235-247, 2004 | 179 | 2004 |
Algorithmic approaches to low overhead fault detection for sparse linear algebra J Sloan, R Kumar, G Bronevetsky IEEE/IFIP International Conference on Dependable Systems and Networks (DSN …, 2012 | 134 | 2012 |
A scalable and distributed dynamic formal verifier for MPI programs A Vo, S Aananthakrishnan, G Gopalakrishnan, BR De Supinski, M Schulz, ... SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 127 | 2010 |
Implementation and evaluation of a scalable application-level checkpoint-recovery scheme for MPI programs M Schulz, G Bronevetsky, R Fernandes, D Marques, K Pingali, P Stodghill SC'04: Proceedings of the 2004 ACM/IEEE Conference on Supercomputing, 38-38, 2004 | 113 | 2004 |
Communication-sensitive static dataflow for parallel message passing applications G Bronevetsky 2009 International Symposium on Code Generation and Optimization, 1-12, 2009 | 102 | 2009 |
Fault resilience of the algebraic multi-grid solver M Casas, BR de Supinski, G Bronevetsky, M Schulz Proceedings of the 26th ACM international conference on Supercomputing, 91-100, 2012 | 101 | 2012 |
Formal analysis of MPI-based parallel programs G Gopalakrishnan, RM Kirby, S Siegel, R Thakur, W Gropp, E Lusk, ... Communications of the ACM 54 (12), 82-91, 2011 | 101 | 2011 |
Compiler-enhanced incremental checkpointing for OpenMP applications G Bronevetsky, DJ Marques, KK Pingali, R Rugina, SA McKee Proceedings of the 13th ACM SIGPLAN Symposium on Principles and practice of …, 2008 | 93 | 2008 |
Collective operations in application-level fault-tolerant MPI G Bronevetsky, D Marques, K Pingali, P Stodghill Proceedings of the 17th annual international conference on Supercomputing …, 2003 | 75 | 2003 |
Run-through stabilization: An MPI proposal for process fault tolerance J Hursey, RL Graham, G Bronevetsky, D Buntinas, H Pritchard, DG Solt Recent Advances in the Message Passing Interface: 18th European MPI Users …, 2011 | 71 | 2011 |
An algorithmic approach to error localization and partial recomputation for low-overhead fault tolerance J Sloan, R Kumar, G Bronevetsky 2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems …, 2013 | 69 | 2013 |
C3: A System for Automating Application-Level Checkpointing of MPI Programs G Bronevetsky, D Marques, K Pingali, P Stodghill Languages and Compilers for Parallel Computing: 16th International Workshop …, 2004 | 68 | 2004 |
AutomaDeD: Automata-based debugging for dissimilar parallel tasks G Bronevetsky, I Laguna, S Bagchi, BR de Supinski, DH Ahn, M Schulz 2010 IEEE/IFIP International Conference on Dependable Systems & Networks …, 2010 | 66 | 2010 |
Hybrid MPI: efficient message passing for multi-core systems A Friedley, G Bronevetsky, T Hoefler, A Lumsdaine Proceedings of the International Conference on High Performance Computing …, 2013 | 64 | 2013 |
Automatic fault characterization via abnormality-enhanced classification G Bronevetsky, I Laguna, BR de Supinski, S Bagchi IEEE/IFIP International Conference on Dependable Systems and Networks (DSN …, 2012 | 64 | 2012 |
Recent advances in checkpoint/recovery systems G Bronevetsky, R Fernandes, D Marques, K Pingali, P Stodghill Proceedings 20th IEEE International Parallel & Distributed Processing …, 2006 | 55 | 2006 |
Large scale debugging of parallel tasks with AutomaDeD I Laguna, T Gamblin, BR de Supinski, S Bagchi, G Bronevetsky, DH Ahn, ... Proceedings of 2011 International Conference for High Performance Computing …, 2011 | 49 | 2011 |