Improving the utilization of micro-operation caches in x86 processors

JB Kotra, J Kalamatianos - 2020 53rd Annual IEEE/ACM …, 2020 - ieeexplore.ieee.org
Most modern processors employ variable length, Complex Instruction Set Computing (CISC)
instructions to reduce instruction fetch energy cost and bandwidth requirements. High …

SeqPoint: Identifying representative iterations of sequence-based neural networks

S Pati, S Aga, MD Sinclair… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org
The ubiquity of deep neural networks (DNNs) continues to rise, making them a crucial
application class for hardware optimizations. However, detailed profiling and …

Cross-Stack Optimizations for Sequence-Based Models on GPUs

S Pati - 2024 - search.proquest.com
Advancements in the field of machine learning have made deep neural networks (DNNs)
ubiquitous. Their application in the domain of natural language processing (NLP) with …

Accurate Simulation of Data Movement in Modern Mobile Multicore Systems

Q Huppert - 2022 - theses.hal.science
Computer system architectures have become increasingly complex. Pushing for better
performance and lower energy consumption, they include multiple cores, GPUs …