Scalable Bayesian optimization using deep neural networks J Snoek, O Rippel, K Swersky, R Kiros, N Satish, N Sundaram, M Patwary, ... International conference on machine learning, 2171-2180, 2015 | 1371 | 2015 |
Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU VW Lee, C Kim, J Chhugani, M Deisher, D Kim, AD Nguyen, N Satish, ... Proceedings of the 37th annual international symposium on Computer …, 2010 | 1227 | 2010 |
Designing efficient sorting algorithms for manycore GPUs N Satish, M Harris, M Garland 2009 IEEE International Symposium on Parallel & Distributed Processing, 1-10, 2009 | 932 | 2009 |
Graphicionado: A high-performance and energy-efficient accelerator for graph analytics TJ Ham, L Wu, N Sundaram, N Satish, M Martonosi 2016 49th annual IEEE/ACM international symposium on microarchitecture …, 2016 | 450 | 2016 |
Sort vs. hash revisited: Fast join implementation on modern multi-core CPUs C Kim, T Kaldewey, VW Lee, E Sedlar, AD Nguyen, N Satish, J Chhugani, ... Proceedings of the VLDB Endowment 2 (2), 1378-1389, 2009 | 447 | 2009 |
FAST: fast architecture sensitive tree search on modern CPUs and GPUs C Kim, J Chhugani, N Satish, E Sedlar, AD Nguyen, T Kaldewey, VW Lee, ... Proceedings of the 2010 ACM SIGMOD International Conference on Management of …, 2010 | 444 | 2010 |
ClearPath: highly parallel collision avoidance for multi-agent simulation SJ Guy, J Chhugani, C Kim, N Satish, M Lin, D Manocha, P Dubey Proceedings of the 2009 ACM SIGGRAPH/Eurographics Symposium on Computer …, 2009 | 437 | 2009 |
GraphMat: High performance graph analytics made productive N Sundaram, NR Satish, MMA Patwary, SR Dulloor, SG Vadlamudi, ... arXiv preprint arXiv:1503.07241, 2015 | 412 | 2015 |
3.5-D blocking optimization for stencil computations on modern CPUs and GPUs A Nguyen, N Satish, J Chhugani, C Kim, P Dubey SC'10: Proceedings of the 2010 ACM/IEEE International Conference for High …, 2010 | 403 | 2010 |
Glow: Graph lowering compiler techniques for neural networks N Rotem, J Fix, S Abdulrasool, G Catron, S Deng, R Dzhabarov, N Gibson, ... arXiv preprint arXiv:1805.00907, 2018 | 351 | 2018 |
DySER: Unifying functionality and parallelism specialization for energy-efficient computing V Govindaraju, CH Ho, T Nowatzki, J Chhugani, N Satish, ... IEEE Micro 32 (5), 38-51, 2012 | 328 | 2012 |
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort N Satish, C Kim, J Chhugani, AD Nguyen, VW Lee, D Kim, P Dubey Proceedings of the 2010 ACM SIGMOD International Conference on Management of …, 2010 | 326 | 2010 |
Data tiering in heterogeneous memory systems SR Dulloor, A Roy, Z Zhao, N Sundaram, N Satish, R Sankaran, ... Proceedings of the Eleventh European Conference on Computer Systems, 1-16, 2016 | 281 | 2016 |
Navigating the maze of graph analytics frameworks using massive graph datasets N Satish, N Sundaram, MMA Patwary, J Seo, J Park, MA Hassaan, ... Proceedings of the 2014 ACM SIGMOD international conference on Management of …, 2014 | 252 | 2014 |
Deep learning inference in Facebook data centers: Characterization, performance optimizations and hardware implications J Park, M Naumov, P Basu, S Deng, A Kalaiah, D Khudia, J Law, P Malani, ... arXiv preprint arXiv:1811.09886, 2018 | 229 | 2018 |
IMP: Indirect memory prefetcher X Yu, CJ Hughes, N Satish, S Devadas Proceedings of the 48th International Symposium on Microarchitecture, 178-190, 2015 | 198 | 2015 |
Fast updates on read-optimized databases using multi-core CPUs J Krueger, C Kim, M Grund, N Satish, D Schwalb, J Chhugani, H Plattner, ... arXiv preprint arXiv:1109.6885, 2011 | 193 | 2011 |
Streaming similarity search over one billion tweets using parallel locality-sensitive hashing N Sundaram, A Turmukhametova, N Satish, T Mostak, P Indyk, S Madden, ... Proceedings of the VLDB Endowment 6 (14), 1930-1941, 2013 | 179 | 2013 |
Can traditional programming bridge the ninja performance gap for parallel computing applications? N Satish, C Kim, J Chhugani, H Saito, R Krishnaiyer, M Smelyanskiy, ... ACM SIGARCH Computer Architecture News 40 (3), 440-451, 2012 | 150 | 2012 |
PALM: Parallel architecture-friendly latch-free modifications to B+ trees on many-core processors J Sewall, J Chhugani, C Kim, N Satish, P Dubey Proceedings of the VLDB Endowment 4 (11), 795-806, 2011 | 149 | 2011 |