Rethinking TLB designs in virtualized environments: A very large part-of-memory TLB JH Ryoo, N Gulur, S Song, LK John ACM SIGARCH Computer Architecture News 45 (2), 469-480, 2017 | 99 | 2017 |
Bi-modal DRAM cache: Improving hit rate, hit latency and bandwidth N Gulur, M Mehendale, R Manikantan, R Govindarajan 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, 38-50, 2014 | 73 | 2014 |
CSALT: Context switch aware large TLB Y Marathe, N Gulur, JH Ryoo, S Song, LK John Proceedings of the 50th Annual IEEE/ACM International Symposium on …, 2017 | 43 | 2017 |
Multiple sub-row buffers in DRAM: Unlocking performance and energy improvement opportunities ND Gulur, R Manikantan, M Mehendale, R Govindarajan Proceedings of the 26th ACM international conference on Supercomputing, 257-266, 2012 | 38 | 2012 |
Anatomy: An analytical model of memory system performance N Gulur, M Mehendale, R Manikantan, R Govindarajan ACM SIGMETRICS Performance Evaluation Review 42 (1), 505-517, 2014 | 24 | 2014 |
A comprehensive analytical performance model of DRAM caches N Gulur, M Mehendale, R Govindarajan Proceedings of the 6th ACM/SPEC International Conference on Performance …, 2015 | 11 | 2015 |
Bi-modal DRAM cache: A scalable and effective die-stacked DRAM cache N Gulur, M Mehendale, R Manikantan, R Govindarajan Proceedings of the 47th Annual IEEE/ACM International Symposium on …, 2014 | 9 | 2014 |
MicroRefresh: Minimizing refresh overhead in DRAM caches N Gulur, R Govindarajan, M Mehendale Proceedings of the Second International Symposium on Memory Systems, 350-361, 2016 | 7 | 2016 |
Express: Simultaneously achieving storage, execution and energy efficiencies in moderately sparse matrix computations S Adavally, N Gulur, K Kavi, A Weaver, P Dutta, B Wang Proceedings of the International Symposium on Memory Systems, 46-60, 2020 | 6 | 2020 |
Row-buffer reorganization: simultaneously improving performance and reducing energy in DRAMs N Gulur, R Manikantan, R Govindarajan, M Mehendale 2011 International Conference on Parallel Architectures and Compilation …, 2011 | 6 | 2011 |
ATTC (@C): Addressable-TLB based Translation Coherence H Gugale, N Gulur, Y Marathe, LK John Proceedings of the ACM International Conference on Parallel Architectures …, 2020 | 3 | 2020 |
Heterogeneous architecture for sparse data processing S Adavally, A Weaver, P Vasireddy, K Kavi, G Mehta, N Gulur 2022 IEEE International Parallel and Distributed Processing Symposium …, 2022 | 2 | 2022 |
CHASM: Security evaluation of cache mapping schemes F Mosquera, N Gulur, K Kavi, G Mehta, H Sun Embedded Computer Systems: Architectures, Modeling, and Simulation: 20th …, 2020 | 2 | 2020 |
Understanding the Performance Benefit of Asynchronous Data Transfers in OpenCL Programs Executing on Media Processors N Gulur, NL Suriya 2015 IEEE 22nd International Conference on High Performance Computing (HiPC …, 2015 | 2 | 2015 |
Method to determine contrariety between architectures containing stratified memory mapped register sets V Easwaran, N Gulur, S Srirangapathi, M Mody, R Gulati, P Karandikar, ... 2014 Fifth International Symposium on Electronic System Design, 210-214, 2014 | 1 | 2014 |
Neural network processor MM Mehendale, N Gulur, SBS Chakravarthy, A Lele, H Sanghvi US Patent App. 18/355,689, 2024 | | 2024 |
Neural network processor MM Mehendale, A Lele, N Gulur, H Sanghvi, SBS Chakravarthy US Patent App. 18/355,795, 2024 | | 2024 |
Neural network processor MM Mehendale, H Sanghvi, N Gulur, A Lele, SBS Chakravarthy US Patent App. 18/355,749, 2024 | | 2024 |
Multiple sub-row buffers in DRAM ND Gulur, R Manikantan, M Mehendale, R Govindarajan Proceedings of the 26th ACM international conference on Supercomputing, 2012 | | 2012 |
IPDPSW 2022 RD Friese, JK Kim, B Shirazi, L White, S Adavally, A Weaver, G Mehta, ... | | |