Improving the utilization of micro-operation caches in x86 processors
Most modern processors employ variable length, Complex Instruction Set Computing (CISC)
instructions to reduce instruction fetch energy cost and bandwidth requirements. High …
SeqPoint: Identifying representative iterations of sequence-based neural networks
The ubiquity of deep neural networks (DNNs) continues to rise, making them a crucial
application class for hardware optimizations. However, detailed profiling and …
Cross-Stack Optimizations for Sequence-Based Models on GPUs
S Pati - 2024 - search.proquest.com
Advancements in the field of machine learning have made deep neural networks (DNNs)
ubiquitous. Their application in the domain of natural language processing (NLP) with …
Accurate Simulation of Data Movement in Modern Mobile Multicore Systems
Q Huppert - 2022 - theses.hal.science
Computer system architectures have become increasingly complex. Pushing for better
performance and lower energy consumption, they include multiple cores, GPUs …