Astra-sim: Enabling sw/hw co-design exploration for distributed dl training platforms S Rashidi, S Sridharan, S Srinivasan, T Krishna 2020 IEEE International Symposium on Performance Analysis of Systems and …, 2020 | 59 | 2020 |
Enabling compute-communication overlap in distributed deep learning training platforms S Rashidi, M Denton, S Sridharan, S Srinivasan, A Suresh, J Nie, ... 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture …, 2021 | 44 | 2021 |
Astra-sim2.0: Modeling hierarchical networks and disaggregated systems for large-model training at scale W Won, T Heo, S Rashidi, S Sridharan, S Srinivasan, T Krishna 2023 IEEE International Symposium on Performance Analysis of Systems and …, 2023 | 30 | 2023 |
Themis: A network bandwidth-aware collective scheduling policy for distributed training of dl models S Rashidi, W Won, S Srinivasan, S Sridharan, T Krishna Proceedings of the 49th Annual International Symposium on Computer …, 2022 | 28 | 2022 |
Improving MLC PCM performance through relaxed write and read for intermediate resistance levels S Rashidi, M Jalili, H Sarbazi-Azad ACM Transactions on Architecture and Code Optimization (TACO) 15 (1), 1-31, 2018 | 27 | 2018 |
A survey on pcm lifetime enhancement schemes S Rashidi, M Jalili, H Sarbazi-Azad ACM Computing Surveys (CSUR) 52 (4), 1-38, 2019 | 23 | 2019 |
Scalable distributed training of recommendation models: An astra-sim + ns3 case-study with tcp/ip transport S Rashidi, P Shurpali, S Sridharan, N Hassani, D Mudigere, K Nair, ... 2020 IEEE Symposium on High-Performance Interconnects (HOTI), 33-42, 2020 | 11 | 2020 |
Efficient distributed inference of deep neural networks via restructuring and pruning A Abdi, S Rashidi, F Fekri, T Krishna Proceedings of the AAAI Conference on Artificial Intelligence 37 (6), 6640-6648, 2023 | 10* | 2023 |
Impact of RoCE congestion control policies on distributed training of DNNs T Khan, S Rashidi, S Sridharan, P Shurpali, A Akella, T Krishna 2022 IEEE Symposium on High-Performance Interconnects (HOTI), 39-48, 2022 | 9 | 2022 |
Chakra: Advancing performance benchmarking and co-design using standardized execution traces S Sridharan, T Heo, L Feng, Z Wang, M Bergeron, W Fu, S Zheng, ... arXiv preprint arXiv:2305.14516, 2023 | 7 | 2023 |
COMET: A comprehensive cluster design methodology for distributed deep learning training DK Kadiyala, S Rashidi, T Heo, AR Bambhaniya, T Krishna, A Daglis arXiv preprint arXiv:2211.16648, 2022 | 5 | 2022 |
LIBRA: Enabling Workload-Aware Multi-Dimensional Network Topology Optimization for Distributed Training of Large AI Models W Won, S Rashidi, S Srinivasan, T Krishna 2024 IEEE International Symposium on Performance Analysis of Systems and …, 2024 | 3 | 2024 |
Exploring multi-dimensional hierarchical network topologies for efficient distributed training of trillion parameter dl models W Won, S Rashidi, S Srinivasan, T Krishna arXiv preprint arXiv:2109.11762, 2021 | 2 | 2021 |
Fred: Flexible reduction-distribution interconnect and communication implementation for wafer-scale distributed training of DNN models S Rashidi, W Won, S Srinivasan, P Gupta, T Krishna arXiv preprint arXiv:2406.19580, 2024 | 1 | 2024 |
Exploring Memory Expansion Designs for Training Mixture-of-Experts Models T Heo, S Rashidi, C Man, DK Kadiyala, W Won, S Srinivasan, ... Workshop on Hot Topics in System Infrastructure (HotInfra), 2023 | 1 | 2023 |
Leveraging Memory Expansion to Accelerate Large-Scale DL Training D Kadiyala, S Rashidi, T Heo, A Bambhaniya, T Krishna, A Daglis 2024 IEEE International Symposium on Performance Analysis of Systems and …, 2024 | | 2024 |
HW-SW Methods for Modeling and Optimizing Communication for Scalable Training of Deep Learning Models S Rashidi Georgia Institute of Technology, 2023 | | 2023 |