Dehao Chen
Verified email at google.com
Title
Cited by
Year
GPipe: Efficient training of giant neural networks using pipeline parallelism
Y Huang, Y Cheng, A Bapna, O Firat, D Chen, M Chen, HJ Lee, J Ngiam, ...
Advances in neural information processing systems 32, 2019
Cited by 1863 · 2019
LaMDA: Language models for dialog applications
R Thoppilan, D De Freitas, J Hall, N Shazeer, A Kulshreshtha, HT Cheng, ...
arXiv preprint arXiv:2201.08239, 2022
Cited by 1664 · 2022
GShard: Scaling giant models with conditional computation and automatic sharding
D Lepikhin, HJ Lee, Y Xu, D Chen, O Firat, Y Huang, M Krikun, N Shazeer, ...
arXiv preprint arXiv:2006.16668, 2020
Cited by 1098 · 2020
MLPerf training benchmark
P Mattson, C Cheng, G Diamos, C Coleman, P Micikevicius, D Patterson, ...
Proceedings of Machine Learning and Systems 2, 336-349, 2020
Cited by 360 · 2020
MapCG: Writing parallel program portable between CPU and GPU
C Hong, D Chen, W Chen, W Zheng, H Lin
Proceedings of the 19th international conference on Parallel architectures …, 2010
Cited by 224 · 2010
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by 212 · 2019
Image classification at supercomputer scale
C Ying, S Kumar, D Chen, T Wang, Y Cheng
arXiv preprint arXiv:1811.06992, 2018
Cited by 162 · 2018
AutoFDO: Automatic feedback-directed optimization for warehouse-scale applications
D Chen, DX Li, T Moseley
Proceedings of the 2016 International Symposium on Code Generation and …, 2016
Cited by 133 · 2016
GSPMD: general and scalable parallelization for ML computation graphs
Y Xu, HJ Lee, D Chen, B Hechtman, Y Huang, R Joshi, M Krikun, ...
arXiv preprint arXiv:2105.04663, 2021
Cited by 130 · 2021
Renelito Delos Santos
R Thoppilan, D De Freitas, J Hall, N Shazeer, A Kulshreshtha, HT Cheng, ...
Cited by 108 · 2022
Taming hardware event samples for FDO compilation
D Chen, N Vachharajani, R Hundt, S Liao, V Ramasamy, P Yuan, W Chen, ...
Proceedings of the 8th annual IEEE/ACM international symposium on Code …, 2010
Cited by 91 · 2010
Overlap communication with dependent computation via decomposition in large deep learning models
S Wang, J Wei, A Sabne, A Davis, B Ilbeyi, B Hechtman, D Chen, ...
Proceedings of the 28th ACM International Conference on Architectural …, 2022
Cited by 64 · 2022
Tree partition based parallel frequent pattern mining on shared memory systems
D Chen, C Lai, W Hu, WG Chen, Y Zhang, W Zheng
Proceedings 20th IEEE International Parallel & Distributed Processing …, 2006
Cited by 54 · 2006
Taming hardware event samples for precise and versatile feedback directed optimizations
D Chen, N Vachharajani, R Hundt, X Li, S Eranian, W Chen, W Zheng
IEEE Transactions on Computers 62 (2), 376-389, 2011
Cited by 50 · 2011
Scale MLPerf-0.6 models on Google TPU-v3 Pods
S Kumar, V Bitorff, D Chen, C Chou, B Hechtman, HJ Lee, N Kumar, ...
arXiv preprint arXiv:1909.09756, 2019
Cited by 42 · 2019
LaMDA: Language models for dialog applications
AD Cohen, A Roberts, A Molina, A Butryna, A Jin, A Kulshreshtha, ...
arXiv preprint arXiv:2201.08239, 2022
Cited by 35 · 2022
Automatic cross-replica sharding of weight update in data-parallel training
Y Xu, HJ Lee, D Chen, H Choi, B Hechtman, S Wang
arXiv preprint arXiv:2004.13336, 2020
Cited by 34 · 2020
Exploring the limits of Concurrency in ML Training on Google TPUs
S Kumar, Y Wang, C Young, J Bradbury, N Kumar, D Chen, A Swing
Proceedings of Machine Learning and Systems 3, 81-92, 2021
Cited by 22 · 2021
Feedback-directed optimizations in GCC with estimated edge profiles from hardware event sampling
V Ramasamy, P Yuan, D Chen, R Hundt
Proceedings of GCC Summit, 87-102, 2008
Cited by 22 · 2008
Compile-time feedback-directed optimizations using estimated edge profiles from hardware-event sampling
R Hundt, V Ramasamy, D Chen
US Patent 8,387,026, 2013
Cited by 20 · 2013
Articles 1–20