Orion: Interference-aware, fine-grained GPU sharing for ML applications F Strati, X Ma, A Klimovic Proceedings of the Nineteenth European Conference on Computer Systems, 1075-1092, 2024 | 31 | 2024 |
DéjàVu: KV-cache Streaming for Fast, Fault-tolerant Generative LLM Serving F Strati, S McAllister, A Phanishayee, J Tarnawski, A Klimovic arXiv preprint arXiv:2403.01876, 2024 | 24 | 2024 |
An adaptive concurrent priority queue for NUMA architectures F Strati, C Giannoula, D Siakavaras, G Goumas, N Koziris Proceedings of the 16th ACM International Conference on Computing Frontiers …, 2019 | 11 | 2019 |
ML training with Cloud GPU shortages: Is cross-region the answer? F Strati, P Elvinger, T Kerimoglu, A Klimovic Proceedings of the 4th Workshop on Machine Learning and Systems, 107-116, 2024 | 10 | 2024 |
Exploring learning rate scaling rules for distributed ML training on transient resources J André, F Strati, A Klimovic Proceedings of the 3rd International Workshop on Distributed Machine …, 2022 | 6 | 2022 |
Towards a platform and benchmark suite for model training on dynamic datasets M Böther, F Strati, V Gsteiger, A Klimovic Proceedings of the 3rd Workshop on Machine Learning and Systems, 8-17, 2023 | 4 | 2023 |
PCcheck: Persistent Concurrent Checkpointing for ML F Strati, M Friedman, A Klimovic Proceedings of the 30th ACM International Conference on Architectural …, 2025 | | 2025 |
Measuring GPU utilization one level deeper P Elvinger, F Strati, NE Jerger, A Klimovic arXiv preprint arXiv:2501.16909, 2025 | | 2025 |
SmartPQ: An Adaptive Concurrent Priority Queue for NUMA Architectures C Giannoula, F Strati, D Siakavaras, G Goumas, N Koziris arXiv preprint arXiv:2406.06900, 2024 | | 2024 |