Harry Dong
Verified email at andrew.cmu.edu
Title / Cited by / Year
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
H Dong, X Yang, Z Zhang, Z Wang, Y Chi, B Chen
International Conference on Machine Learning (ICML), 2024
Cited by: 40*; Year: 2024
Fast and provable tensor robust principal component analysis via scaled gradient descent
H Dong, T Tong, C Ma, Y Chi
Information and Inference: A Journal of the IMA 12 (3), 1716-1758, 2023
Cited by: 17; Year: 2023
Prompt-prompted Adaptive Structured Pruning for Efficient LLM Generation
H Dong, B Chen, Y Chi
Conference on Language Modeling (COLM), 2024
Cited by: 12*; Year: 2024
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
H Sun, LW Chang, W Bao, S Zheng, N Zheng, X Liu, H Dong, Y Chi, ...
arXiv preprint arXiv:2410.21465, 2024
Cited by: 11*; Year: 2024
Deep unfolded tensor robust PCA with self-supervised learning
H Dong, M Shah, S Donegan, Y Chi
ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and …, 2023
Cited by: 7; Year: 2023
Towards structured sparsity in transformers for efficient inference
H Dong, B Chen, Y Chi
Workshop on Efficient Systems for Foundation Models @ ICML 2023, 2023
Cited by: 6; Year: 2023
A lightweight transformer for faster and robust EBSD data collection
H Dong, S Donegan, M Shah, Y Chi
Scientific Reports 13 (1), 21253, 2023
Cited by: 2; Year: 2023
Learning optimal traffic routing behaviors using Markovian framework in microscopic simulation
T Cabannes, J Li, F Wu, H Dong, AM Bayen
Transportation Review Board Annual Meeting 2020, 2020
Cited by: 1; Year: 2020
Leveraging Multimodal Diffusion Models to Accelerate Imaging with Side Information
T Efimov, H Dong, M Shah, J Simmons, S Donegan, Y Chi
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025
Year: 2025
Towards Low-bit Communication for Tensor Parallel LLM Inference
H Dong, T Johnson, M Cho, E Soroush
NeurIPS Workshop on Efficient Natural Language and Speech Processing IV, 2024
Year: 2024