The Llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv preprint arXiv:2407.21783, 2024 | 2565 | 2024 |
Scalpel: Customizing DNN pruning to the underlying hardware parallelism J Yu, A Lukefahr, D Palframan, G Dasika, R Das, S Mahlke Proceedings of the 44th Annual International Symposium on Computer …, 2017 | 490 | 2017 |
Bit prudent in-cache acceleration of deep convolutional neural networks X Wang, J Yu, C Augustine, R Iyer, R Das 2019 IEEE International Symposium on High Performance Computer Architecture …, 2019 | 56 | 2019 |
First-generation inference accelerator deployment at Facebook M Anderson, B Chen, S Chen, S Deng, J Fix, M Gschwind, A Kalaiah, ... arXiv preprint arXiv:2107.04140, 2021 | 34 | 2021 |
Compute-capable block RAMs for efficient deep learning acceleration on FPGAs X Wang, V Goyal, J Yu, V Bertacco, A Boutros, E Nurvitadhi, C Augustine, ... 2021 IEEE 29th Annual International Symposium on Field-Programmable Custom …, 2021 | 29 | 2021 |
Systems and devices for formatting neural network parameters YU Jiecao, A Lukefahr, D Palframan, G Dasika, R Das, S Mahlke US Patent 11,275,996, 2022 | 26 | 2022 |
TF-Net: Deploying sub-byte deep neural networks on microcontrollers J Yu, A Lukefahr, R Das, S Mahlke ACM Transactions on Embedded Computing Systems (TECS) 18 (5s), 1-21, 2019 | 26 | 2019 |
Spatial-Winograd pruning enabling sparse Winograd convolution J Yu, J Park, M Naumov arXiv preprint arXiv:1901.02132, 2019 | 13 | 2019 |
Alternate model growth and pruning for efficient training of recommendation systems X Du, B Bhushanam, J Yu, D Choudhary, T Gao, S Wong, L Feng, J Park, ... 2021 20th IEEE International Conference on Machine Learning and Applications …, 2021 | 8 | 2021 |
Adaptive dense-to-sparse paradigm for pruning online recommendation system with non-stationary data M Ye, D Choudhary, J Yu, E Wen, Z Chen, J Yang, J Park, Q Liu, ... arXiv preprint arXiv:2010.08655, 2020 | 8 | 2020 |
BitSET: Bit-serial early termination for computation reduction in convolutional neural networks Y Pan, J Yu, A Lukefahr, R Das, S Mahlke ACM Transactions on Embedded Computing Systems 22 (5s), 1-24, 2023 | 7 | 2023 |
Systems and devices for compressing neural network parameters YU Jiecao, A Lukefahr, D Palframan, G Dasika, R Das, S Mahlke US Patent 11,321,604, 2022 | 4 | 2022 |
Controlling transition between using first and second processing circuitry A Lukefahr, S Padmanabha, R Das, S Mahlke, YU Jiecao US Patent 10,310,858, 2019 | | 2019 |
Efficient Deep Neural Network Computation on Processors J Yu | | 2019 |
Adaptive Cache Partitioning on a Composite Core J Yu, A Lukefahr, S Padmanabha, R Das, S Mahlke | | 2015 |
Alternate Model Growth and Pruning for Efficient Training of Recommendation Systems X Du, B Bhushanam, J Yu, D Choudhary, T Gao, S Wong, L Feng, ... | | |
Retrospective: Scalpel: Customizing DNN Pruning to the Underlying Hardware Parallelism J Yu, R Das, S Mahlke | | |