The Llama 3 herd of models A Dubey, A Jauhri, A Pandey, A Kadian, A Al-Dahle, A Letman, A Mathur, ... arXiv preprint arXiv:2407.21783, 2024 | 2527* | 2024 |
Chimera: Collaborative preemption for multitasking on a shared GPU JJK Park, Y Park, S Mahlke ACM SIGARCH Computer Architecture News 43 (1), 593-606, 2015 | 219 | 2015 |
Dynamic resource management for efficient utilization of multitasking GPUs JJK Park, Y Park, S Mahlke Proceedings of the twenty-second international conference on architectural …, 2017 | 99 | 2017 |
Libra: Tailoring SIMD execution using heterogeneous hardware and dynamic configurability Y Park, JJK Park, H Park, S Mahlke 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, 84-95, 2012 | 44 | 2012 |
Efficient performance scaling of future CGRAs for mobile applications Y Park, JJK Park, S Mahlke 2012 International Conference on Field-Programmable Technology, 335-342, 2012 | 23 | 2012 |
Improving GPU multitasking efficiency using dynamic resource sharing J Kim, J Cha, JJK Park, D Jeon, Y Park IEEE Computer Architecture Letters 18 (1), 1-5, 2018 | 21 | 2018 |
ELF: Maximizing memory-level parallelism for GPUs with coordinated warp and fetch scheduling JJK Park, Y Park, S Mahlke Proceedings of the International Conference for High Performance Computing …, 2015 | 15 | 2015 |
A bypass first policy for energy-efficient last level caches JJK Park, Y Park, S Mahlke 2016 International Conference on Embedded Computer Systems: Architectures …, 2016 | 11 | 2016 |
Efficient execution of augmented reality applications on mobile programmable accelerators JJK Park, Y Park, S Mahlke 2013 International Conference on Field-Programmable Technology (FPT), 176-183, 2013 | 11 | 2013 |
Fine grain cache partitioning using per-instruction working blocks JJK Park, Y Park, S Mahlke 2015 International Conference on Parallel Architecture and Compilation (PACT …, 2015 | 5 | 2015 |
Parameter Caching for Neural Network Accelerators J Liu, DH Woo, JJK Park, R Ashok US Patent App. 16/971,595, 2022 | 4 | 2022 |
A 40 Mbps H.264/AVC CAVLC decoder using a 64-bit multiple-issue video parsing coprocessor S Choi, JJK Park, M Koo, D Kim, SI Chae 23rd IEEE International SOC Conference, 105-108, 2010 | 4 | 2010 |
Preemption in a machine learning hardware accelerator T Fadelu, R Narayanaswami, J Min, D Li, S Gupta, JJK Park US Patent App. 18/036,506, 2023 | 1 | 2023 |
Method and apparatus for selecting preemption technique JJK Park, S Mahlke, YOO Donghoon US Patent 9,898,333, 2018 | 1 | 2018 |
Efficiently performing inference computations of a fully convolutional network for inputs with different sizes T Kumar, SA Halambi, JJK Park, A Chauhan, DH Woo US Patent 2,021,056,418, 2023 | | 2023 |
Enabling Efficient Resource Utilization on Multitasking Throughput Processors. JJK Park | | 2016 |