Chao Wang
Scientist, National Center for Computational Sciences, Oak Ridge National Laboratory
Verified email at ornl.gov
Title
Cited by
Year
Proactive process-level live migration in HPC environments
C Wang, F Mueller, C Engelmann, SL Scott
SC'08: Proceedings of the 2008 ACM/IEEE conference on Supercomputing, 1-12, 2008
Cited by 248, 2008
A job pause service under LAM/MPI+BLCR for transparent fault tolerance
C Wang, F Mueller, C Engelmann, SL Scott
2007 IEEE International Parallel and Distributed Processing Symposium, 1-10, 2007
Cited by 114, 2007
NVMalloc: Exposing an aggregate SSD store as a memory partition in extreme-scale machines
C Wang, SS Vazhkudai, X Ma, F Meng, Y Kim, C Engelmann
2012 IEEE 26th International Parallel and Distributed Processing Symposium …, 2012
Cited by 101, 2012
Hybrid checkpointing for MPI jobs in HPC environments
C Wang, F Mueller, C Engelmann, SL Scott
2010 IEEE 16th International Conference on Parallel and Distributed Systems …, 2010
Cited by 71, 2010
Proactive process-level live migration and back migration in HPC environments
C Wang, F Mueller, C Engelmann, SL Scott
Journal of Parallel and Distributed Computing 72 (2), 254-267, 2012
Cited by 62, 2012
Scalable, fault tolerant membership for MPI tasks on HPC systems
J Varma, C Wang, F Mueller, C Engelmann, SL Scott
Proceedings of the 20th annual international conference on Supercomputing …, 2006
Cited by 44, 2006
Optimizing center performance through coordinated data staging, scheduling and recovery
Z Zhang, C Wang, SS Vazhkudai, X Ma, GG Pike, JW Cobb, F Mueller
Proceedings of the 2007 ACM/IEEE conference on Supercomputing, 1-11, 2007
Cited by 35, 2007
Hybrid full/incremental checkpoint/restart for MPI jobs in HPC environments
C Wang, F Mueller, C Engelmann, SL Scott
International Conference on Parallel and Distributed Systems, 2011
Cited by 19, 2011
Improving the availability of supercomputer job input data using temporal replication
C Wang, Z Zhang, X Ma, SS Vazhkudai, F Mueller
Computer Science-Research and Development 23, 149-157, 2009
Cited by 18, 2009
MOLAR: Adaptive runtime support for high-end computing operating and runtime systems
C Engelmann, SL Scott, DE Bernholdt, NR Gottumukkala, C Leangsuksun, ...
ACM SIGOPS Operating Systems Review 40 (2), 63-72, 2006
Cited by 17, 2006
Understanding object-level memory access patterns across the spectrum
X Ji, C Wang, N El-Sayed, X Ma, Y Kim, SS Vazhkudai, W Xue, ...
Proceedings of the International Conference for High Performance Computing …, 2017
Cited by 15, 2017
A tunable holistic resiliency approach for high-performance computing systems
SL Scott, C Engelmann, GR Vallée, T Naughton, A Tikotekar, ...
Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of …, 2009
Cited by 15, 2009
On-the-fly recovery of job input data in supercomputers
C Wang, Z Zhang, SS Vazhkudai, X Ma, F Mueller
2008 37th International Conference on Parallel Processing, 620-627, 2008
Cited by 6, 2008
Transparent fault tolerance for job healing in HPC environments
C Wang
North Carolina State University, 2009
Cited by 5, 2009
Transparent Fault Tolerance for Job Input Data in HPC Environments
C Wang, SS Vazhkudai, X Ma, F Mueller
Cited by 2, 2014
Hybrid Full/Incremental Checkpoint/Restart for MPI Jobs in HPC Environments
W Chao, F Mueller, C Engelmann
Proc. of the 16th International Conference on Parallel and Distributed …, 2011
Cited by 2, 2011
GPFS Evaluation Report
C Wang
Technical Report, National Center for Computational Sciences, Oak Ridge …, 2016
2016
A Study on Application Heap Object-level Memory Access Patterns
X Ji, C Wang, X Ma, S Vazhkudai, Y Kim
Technical Report, National Center for Computational Sciences, Oak Ridge …, 2016
2016
Back-Migration for MPI Jobs in HPC Environments
C Wang, F Mueller, C Engelmann, SL Scott
Forum to Address Scalable Technology for Runtime and Operating Systems (FastOS), 2009
2009
Resiliency for High-Performance Computing Systems
1st High-Performance Computer Science Week (HPCSW) 2008, 2008
2008