Jianyu Huang
Meta Platforms, Inc.
Verified email at meta.com
Title
Cited by
Year
The Llama 3 Herd of Models
A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ...
arXiv preprint arXiv:2407.21783, 2024
Cited by: 2867; Year: 2024
Deep Learning Recommendation Model for Personalization and Recommendation Systems
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
Cited by: 845; Year: 2019
A Study of BFLOAT16 for Deep Learning Training
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 386; Year: 2019
Software-hardware co-design for fast and scalable training of deep learning recommendation models
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
Proceedings of the 49th Annual International Symposium on Computer …, 2022
Cited by: 144; Year: 2022
The Llama 3 Herd of Models
A Grattafiori, A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, ...
arXiv e-prints, arXiv:2407.21783, 2024
Cited by: 96; Year: 2024
Strassen's algorithm reloaded
J Huang, TM Smith, GM Henry, RA van de Geijn
High Performance Computing, Networking, Storage and Analysis, SC16 …, 2016
Cited by: 93; Year: 2016
FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference
D Khudia, J Huang, P Basu, S Deng, H Liu, J Park, M Smelyanskiy
arXiv preprint arXiv:2101.05615, 2021
Cited by: 56
Performance optimization for the k-nearest neighbors kernel on x86 architectures
CD Yu, J Huang, W Austin, B Xiao, G Biros
Proceedings of the International Conference for High Performance Computing …, 2015
Cited by: 49; Year: 2015
Deep Learning Recommendation Model for Personalization and Recommendation Systems. CoRR abs/1906.00091 (2019)
M Naumov, D Mudigere, HJM Shi, J Huang, N Sundaraman, J Park, ...
arXiv preprint arXiv:1906.00091, 2019
Cited by: 45*; Year: 2019
Generating families of practical fast matrix multiplication algorithms
J Huang, L Rice, DA Matthews, RA van de Geijn
2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2017
Cited by: 40; Year: 2017
High-performance, Distributed Training of Large-scale Deep Learning Recommendation Models
D Mudigere, Y Hao, J Huang, A Tulloch, S Sridharan, X Liu, M Ozdal, ...
arXiv preprint arXiv:2104.05158, 2021
Cited by: 37; Year: 2021
Mahmoud Khorashadi, Pallab Bhattacharya, Petr Lapukhov, Maxim Naumov, Ajit Mathews, Lin Qiao, Mikhail Smelyanskiy, Bill Jia, and Vijay Rao. 2021. Software-Hardware Co-design …
D Mudigere, Y Hao, J Huang, Z Jia, A Tulloch, S Sridharan, X Liu, ...
arXiv preprint arXiv:2104.05158, 2022
Cited by: 34; Year: 2022
Mixed-Precision Embedding Using a Cache
JA Yang, J Huang, J Park, PTP Tang, A Tulloch
arXiv preprint arXiv:2010.11305, 2020
Cited by: 34; Year: 2020
Strassen’s Algorithm Reloaded on GPUs
J Huang, CD Yu, RA van de Geijn
ACM Transactions on Mathematical Software (TOMS) 46 (1), 1-22, 2020
Cited by: 28; Year: 2020
Implementing Strassen's Algorithm with CUTLASS on NVIDIA Volta GPUs
J Huang, CD Yu, RA van de Geijn
arXiv preprint arXiv:1808.07984, 2018
Cited by: 24; Year: 2018
Strassen's Algorithm for Tensor Contraction
J Huang, DA Matthews, RA van de Geijn
SIAM Journal on Scientific Computing 40 (3), C305-C326, 2018
Cited by: 24; Year: 2018
A study of BFLOAT16 for deep learning training (2019)
D Kalamkar, D Mudigere, N Mellempudi, D Das, K Banerjee, S Avancha, ...
arXiv preprint arXiv:1905.12322, 2019
Cited by: 20; Year: 2019
BLISlab: A Sandbox for Optimizing GEMM
J Huang, RA van de Geijn
arXiv preprint arXiv:1609.00076, 2016
Cited by: 16; Year: 2016
AdaEmbed: Adaptive Embedding for Large-Scale Recommendation Models
F Lai, W Zhang, R Liu, W Tsai, X Wei, Y Hu, S Devkota, J Huang, J Park, ...
17th USENIX Symposium on Operating Systems Design and Implementation (OSDI …, 2023
Cited by: 14; Year: 2023
Efficient soft-error detection for low-precision deep learning recommendation models
S Li, J Huang, PTP Tang, D Khudia, J Park, HD Dixit, Z Chen
2022 IEEE International Conference on Big Data (Big Data), 1556-1563, 2022
Cited by: 14; Year: 2022