Squeezellm: Dense-and-sparse quantization S Kim, C Hooper, A Gholami, Z Dong, X Li, S Shen, MW Mahoney, ... arXiv preprint arXiv:2306.07629, 2023 | 178 | 2023 |
Kvquant: Towards 10 million context length llm inference with kv cache quantization C Hooper, S Kim, H Mohammadzadeh, MW Mahoney, YS Shao, ... arXiv preprint arXiv:2401.18079, 2024 | 128* | 2024 |
Edgebert: Sentence-level energy optimizations for latency-aware multi-task nlp inference T Tambe, C Hooper, L Pentecost, T Jia, EY Yang, M Donato, V Sanh, ... MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture …, 2021 | 121 | 2021 |
Full stack optimization of transformer inference: a survey S Kim, C Hooper, T Wattanawong, M Kang, R Yan, H Genc, G Dinh, ... arXiv preprint arXiv:2302.14017, 2023 | 110 | 2023 |
Saturation velocity determination for In0.53Ga0.47As field-effect transistors S Bandy, C Nishimoto, S Hyder, C Hooper Applied Physics Letters 38 (10), 817-819, 1981 | 103 | 1981 |
S-lora: Serving thousands of concurrent lora adapters Y Sheng, S Cao, D Li, C Hooper, N Lee, S Yang, C Chou, B Zhu, L Zheng, ... arXiv preprint arXiv:2311.03285, 2023 | 93 | 2023 |
AI and memory wall A Gholami, Z Yao, S Kim, C Hooper, MW Mahoney, K Keutzer IEEE Micro, 2024 | 86 | 2024 |
9.8 A 25mm2 SoC for IoT Devices with 18ms Noise-Robust Speech-to-Text Latency via Bayesian Speech Denoising and Attention-Based Sequence-to-Sequence … T Tambe, EY Yang, GG Ko, Y Chai, C Hooper, M Donato, PN Whatmough, ... 2021 IEEE International Solid-State Circuits Conference (ISSCC) 64, 158-160, 2021 | 44 | 2021 |
Vapor-phase epitaxial growth of quaternary In1-xGaxAsyP1-y in the 0.75-1.35-eV band-gap range SB Hyder, RR Saxena, CC Hooper Applied Physics Letters 34 (9), 584, 1979 | 25 | 1979 |
Speed: Speculative pipelined execution for efficient decoding C Hooper, S Kim, H Mohammadzadeh, H Genc, K Keutzer, A Gholami, ... arXiv preprint arXiv:2310.12072, 2023 | 21 | 2023 |
22.9 A 12nm 18.1 TFLOPs/W sparse transformer processor with entropy-based early exit, mixed-precision predication and fine-grained power management T Tambe, J Zhang, C Hooper, T Jia, PN Whatmough, J Zuckerman, ... 2023 IEEE International Solid-State Circuits Conference (ISSCC), 342-344, 2023 | 21 | 2023 |
A 16-nm soc for noise-robust speech and nlp edge ai inference with bayesian sound source separation and attention-based dnns T Tambe, EY Yang, GG Ko, Y Chai, C Hooper, M Donato, PN Whatmough, ... IEEE Journal of Solid-State Circuits 58 (2), 569-581, 2022 | 18 | 2022 |
Tinyagent: Function calling at the edge LE Erdogan, N Lee, S Jha, S Kim, R Tabrizi, S Moon, C Hooper, ... arXiv preprint arXiv:2409.00608, 2024 | 9 | 2024 |
Property-aware multi-speaker data simulation: A probabilistic modelling technique for synthetic data generation TJ Park, H Huang, C Hooper, N Koluguri, K Dhawan, A Jukic, J Balam, ... arXiv preprint arXiv:2310.12371, 2023 | 7 | 2023 |
Squeezed attention: Accelerating long context length llm inference C Hooper, S Kim, H Mohammadzadeh, M Maheswaran, J Paik, ... arXiv preprint arXiv:2411.09688, 2024 | 3 | 2024 |
Learned best-effort llm serving S Jha, C Hooper, X Liu, S Kim, K Keutzer arXiv preprint arXiv:2401.07886, 2024 | 2 | 2024 |
SM6: A 16nm System-on-Chip for Accurate and Noise-Robust Attention-Based NLP Applications: The 33rd Hot Chips Symposium – August 22-24, 2021 T Tambe, EY Yang, GG Ko, Y Chai, C Hooper, M Donato, PN Whatmough, ... 2021 IEEE Hot Chips 33 Symposium (HCS), 1-13, 2021 | 2 | 2021 |
Edgebert: Sentence-level energy optimizations for latency-aware multi-task nlp inference T Tambe, C Hooper, L Pentecost, T Jia, EY Yang, M Donato, V Sanh, ... MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021 | 2 | 2021 |
Quantifying and maximizing the benefits of back-end noise adaption on attention-based speech recognition models C Hooper, T Tambe, GY Wei arXiv preprint arXiv:2105.01134, 2021 | 1 | 2021 |
ETS: Efficient Tree Search for Inference-Time Scaling C Hooper, S Kim, S Moon, K Dilmen, M Maheswaran, N Lee, ... arXiv preprint arXiv:2502.13575, 2025 | | 2025 |