A survey of resource-efficient LLM and multimodal foundation models M Xu, W Yin, D Cai, R Yi, D Xu, Q Wang, B Wu, Y Zhao, C Yang, S Wang, ... arXiv preprint arXiv:2401.08092, 2024 | 89 | 2024 |
Mandheling: Mixed-precision on-device DNN training with DSP offloading D Xu, M Xu, Q Wang, S Wang, Y Ma, K Huang, G Huang, X Jin, X Liu Proceedings of the 28th Annual International Conference on Mobile Computing …, 2022 | 54 | 2022 |
LLMCad: Fast and scalable on-device large language model inference D Xu, W Yin, X Jin, Y Zhang, S Wei, M Xu, X Liu arXiv preprint arXiv:2309.04255, 2023 | 49 | 2023 |
Empowering 1000 tokens/second on-device LLM prefilling with mllm-NPU D Xu, H Zhang, L Yang, R Liu, G Huang, M Xu, X Liu arXiv preprint arXiv:2407.05858, 2024 | 13 | 2024 |
ELMS: Elasticized large language models on mobile devices W Yin, R Yi, D Xu, G Huang, M Xu, X Liu arXiv preprint arXiv:2409.09071, 2024 | 5 | 2024 |
Towards Energy-efficient Federated Learning via INT8-based Training on Mobile DSPs J Yuan, S Wang, H Li, D Xu, Y Li, M Xu, X Liu Proceedings of the ACM on Web Conference 2024, 2786-2794, 2024 | 5 | 2024 |
WiP: Efficient LLM prefilling with mobile NPU D Xu, H Zhang, L Yang, R Liu, M Xu, X Liu Proceedings of the Workshop on Edge and Mobile Foundation Models, 33-35, 2024 | 4 | 2024 |
SoCFlow: Efficient and Scalable DNN Training on SoC-Clustered Edge Servers D Xu, M Xu, C Lou, L Zhang, G Huang, X Jin, X Liu Proceedings of the 29th ACM International Conference on Architectural …, 2024 | 3 | 2024 |
Fast On-device LLM Inference with NPUs D Xu, H Zhang, L Yang, R Liu, G Huang, M Xu, X Liu Proceedings of the 30th ACM International Conference on Architectural …, 2025 | 1 | 2025 |
Efficient, Scalable, and Sustainable DNN Training on SoC-Clustered Edge Servers M Xu, D Xu, C Lou, L Zhang, G Huang, X Jin, X Liu IEEE Transactions on Mobile Computing, 2024 | 1 | 2024 |
Niagara: Scheduling DNN Inference Services on Heterogeneous Edge Processors D Xu, Q Li, M Xu, K Huang, G Huang, S Wang, X Jin, Y Ma, X Liu International Conference on Service-Oriented Computing, 67-85, 2023 | 1 | 2023 |
EdgeLLM: Fast On-device LLM Inference with Speculative Decoding D Xu, W Yin, H Zhang, X Jin, Y Zhang, S Wei, M Xu, X Liu IEEE Transactions on Mobile Computing, 2024 | | 2024 |
PieBridge: Fast and Parameter-Efficient On-Device Training via Proxy Networks W Yin, D Xu, G Huang, Y Zhang, S Wei, M Xu, X Liu Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems …, 2024 | | 2024 |
Peking University, Beijing 1000871, China {liqingpostdoc, xudaliang}@pku.edu.cn Q Li, D Xu Service-oriented Computing--ICSOC 2023 Workshops: AI-PA, ASOCA, SAPD, SQS …, 2024 | | 2024 |
Satellite Computing: From Space to Your Screen Q Li, D Xu International Conference on Service-Oriented Computing, 343-349, 2023 | | 2023 |
S3Library: Automatically Eliminating C/C++ Buffer Overflow using Compatible Safer Libraries K Sun, D Xu, D Chen, X Cheng, D Tong arXiv preprint arXiv:2004.09062, 2020 | | 2020 |
DangKiller: Eliminating Dangling Pointers Efficiently via Implicit Identifier D Xu, D Chen, C Yang, X Cheng, D Tong arXiv preprint arXiv:2003.00175, 2020 | | 2020 |
Saturation Memory Access: Mitigating Memory Spatial Errors without Terminating Programs D Chen, D Xu, D Tong, K Sun, X Guan, C Yang, X Cheng arXiv preprint arXiv:2002.02831, 2020 | | 2020 |