Siddharth Singh
PhD Student, Computer Science, University of Maryland
Verified email at umd.edu
Title
Cited by
Year
Stance detection in web and social media: a comparative study
S Ghosh, P Singhania, S Singh, K Rudra, S Ghosh
Experimental IR Meets Multilinguality, Multimodality, and Interaction: 10th …, 2019
Cited by: 107, Year: 2019
A hybrid tensor-expert-data parallelism approach to optimize mixture-of-experts training
S Singh, O Ruwase, AA Awan, S Rajbhandari, Y He, A Bhatele
Proceedings of the 37th International Conference on Supercomputing, 203-214, 2023
Cited by: 21, Year: 2023
Be like a goldfish, don't memorize! mitigating memorization in generative llms
A Hans, J Kirchenbauer, Y Wen, N Jain, H Kazemi, P Singhania, S Singh, ...
Advances in Neural Information Processing Systems 37, 24022-24045, 2025
Cited by: 16, Year: 2025
AxoNN: An asynchronous, message-driven parallel framework for extreme-scale deep learning
S Singh, A Bhatele
2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2022
Cited by: 16, Year: 2022
Loki: Low-rank keys for efficient sparse attention
P Singhania, S Singh, S He, S Feizi, A Bhatele
Advances in Neural Information Processing Systems 37, 16692-16723, 2025
Cited by: 14, Year: 2025
A survey and empirical evaluation of parallel deep learning frameworks
D Nichols, S Singh, SH Lin, A Bhatele
arXiv preprint arXiv:2111.04949, 2021
Cited by: 10*, Year: 2021
Inducing Cooperation in Multi-Agent Games Through Status-Quo Loss
P Badjatiya, M Sarkar, A Sinha, S Singh, N Puri, B Krishnamurthy
arXiv preprint, 2020
Cited by: 9*, Year: 2020
Exploiting sparsity in pruned neural networks to optimize large model training
S Singh, A Bhatele
2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS …, 2023
Cited by: 8, Year: 2023
PySchedCL: Leveraging Concurrency in Heterogeneous Data-Parallel Systems
A Ghose, S Singh, V Kulaharia, L Dokara, S Maity, S Dey
IEEE Transactions on Computers 71 (9), 2234-2247, 2021
Cited by: 4, Year: 2021
Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach
J Geiping, S McLeish, N Jain, J Kirchenbauer, S Singh, BR Bartoldson, ...
arXiv preprint arXiv:2502.05171, 2025
Cited by: 3, Year: 2025
A 4D Hybrid Algorithm to Scale Parallel Training to Thousands of GPUs
arXiv preprint arXiv:2305.13525, 2023
Cited by: 3*, Year: 2023
HPC-Coder-V2: Studying Code LLMs Across Low-Resource Parallel Languages
A Chaturvedi, D Nichols, S Singh, A Bhatele
arXiv preprint arXiv:2412.15178, 2024
Cited by: 1, Year: 2024
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
S Singh, P Singhania, A Ranjan, J Kirchenbauer, J Geiping, Y Wen, ...
SC24: International Conference for High Performance Computing, Networking …, 2024
Cited by: 1, Year: 2024
Jorge: Approximate Preconditioning for GPU-efficient Second-order Optimization
S Singh, Z Sating, A Bhatele
arXiv preprint arXiv:2310.12298, 2023
Cited by: 1, Year: 2023
Gemstones: A Model Suite for Multi-Faceted Scaling Laws
S McLeish, J Kirchenbauer, DY Miller, S Singh, A Bhatele, M Goldblum, ...
arXiv preprint arXiv:2502.06857, 2025
Year: 2025
Eve: Less Memory, Same Might
A Tomar, S Singh, T Goldstein, A Bhatele
Year: 2024
Creating Code LLMs for HPC: It’s LLMs All the Way Down
A Chaturvedi, D Nichols, S Singh, A Bhatele