AOS: Adaptive overwrite scheme for energy-efficient MLC STT-RAM cache X Chen, N Khoshavi, J Zhou, D Huang, RF DeMara, J Wang, W Wen, ... Proceedings of the 53rd Annual Design Automation Conference, 1-6, 2016 | 100 | 2016 |
Energy-aware adaptive restore schemes for MLC STT-RAM cache X Chen, N Khoshavi, RF DeMara, J Wang, D Huang, W Wen, Y Chen IEEE Transactions on Computers 66 (5), 786-798, 2016 | 54 | 2016 |
Achieving load balance for parallel data access on distributed file systems D Huang, D Han, J Wang, J Yin, X Chen, X Zhang, J Zhou, M Ye IEEE Transactions on Computers 67 (3), 388-402, 2017 | 35 | 2017 |
Opass: Analysis and optimization of parallel data access on distributed file systems J Yin, J Wang, J Zhou, T Lukasiewicz, D Huang, J Zhang 2015 IEEE International Parallel and Distributed Processing Symposium, 623-632, 2015 | 29 | 2015 |
Identifying latent reduced models to precondition lossy compression H Luo, D Huang, Q Liu, Z Qiao, H Jiang, J Bi, H Yuan, M Zhou, J Wang, ... 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2019 | 21 | 2019 |
Optimizing massively parallel Winograd convolution on ARM processor D Li, D Huang, Z Chen, Y Lu Proceedings of the 50th International Conference on Parallel Processing, 1-12, 2021 | 17 | 2021 |
Harnessing data movement in virtual clusters for in-situ execution D Huang, Q Liu, S Klasky, J Wang, JY Choi, J Logan, N Podhorszki IEEE Transactions on Parallel and Distributed Systems 30 (3), 615-629, 2018 | 13 | 2018 |
Can I/O variability be reduced on QoS-less HPC storage systems? D Huang, Q Liu, J Choi, N Podhorszki, S Klasky, J Logan, G Ostrouchov, ... IEEE Transactions on Computers 68 (5), 631-645, 2018 | 12 | 2018 |
Distributed software emulator for cyber-physical analysis in smart grid S Tan, W Song, D Huang, Q Dong, L Tong IEEE Transactions on Emerging Topics in Computing 5 (4), 506-517, 2014 | 12 | 2014 |
Optimizing small channel 3D convolution on GPU with tensor core J Jiang, D Huang, J Du, Y Lu, X Liao Parallel Computing 113, 102954, 2022 | 10 | 2022 |
Deister: A light-weight autonomous block management in data-intensive file systems using deterministic declustering distribution J Wang, X Zhang, J Zhang, J Yin, D Han, R Wang, D Huang Journal of Parallel and Distributed Computing 108, 3-13, 2017 | 10 | 2017 |
Persistent items tracking in large data streams based on adaptive sampling L Chen, RCW Phan, Z Chen, D Huang IEEE INFOCOM 2022-IEEE Conference on Computer Communications, 1948-1957, 2022 | 9 | 2022 |
A comprehensive study of in-memory computing on large HPC systems D Huang, Z Qin, Q Liu, N Podhorszki, S Klasky 2020 IEEE 40th International Conference on Distributed Computing Systems …, 2020 | 9 | 2020 |
Handling heavy-tailed input of transformer inference on GPUs J Du, J Jiang, Y You, D Huang, Y Lu Proceedings of the 36th ACM International Conference on Supercomputing, 1-11, 2022 | 8 | 2022 |
Identifying challenges and opportunities of in-memory computing on large HPC systems D Huang, Z Qin, Q Liu, N Podhorszki, S Klasky Journal of Parallel and Distributed Computing 164, 106-122, 2022 | 8 | 2022 |
Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs J Du, J Jiang, J Zheng, H Zhang, D Huang, Y Lu ACM Transactions on Architecture and Code Optimization 20 (4), 1-22, 2023 | 7 | 2023 |
Liger: Interleaving Intra-and Inter-Operator Parallelism for Distributed Large Model Inference J Du, J Wei, J Jiang, S Cheng, D Huang, Z Chen, Y Lu Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and …, 2024 | 6 | 2024 |
Full-stack optimizing transformer inference on ARM many-core CPU J Jiang, J Du, D Huang, Z Chen, Y Lu, X Liao IEEE Transactions on Parallel and Distributed Systems 34 (7), 2221-2235, 2023 | 6 | 2023 |
Characterizing and optimizing transformer inference on ARM many-core processor J Jiang, J Du, D Huang, D Li, J Zheng, Y Lu Proceedings of the 51st International Conference on Parallel Processing, 1-11, 2022 | 6 | 2022 |
SideIO: A Side I/O system framework for hybrid scientific workflow J Wang, D Huang, H Wu, J Yin, X Zhang, X Chen, R Wang Journal of Parallel and Distributed Computing 108, 45-58, 2017 | 6 | 2017 |