FACET: On-the-Fly Activation Compression for Efficient Transformer Training
S Lee, G Yun, XT Nguyen… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Training Transformer models, known for their outstanding performance in various tasks, can
be challenging due to extensive training times and substantial memory requirements. One …
PIMnast: Balanced Data Placement for GEMV Acceleration with Processing-In-Memory
With unprecedented demand for generative AI (GenAI) inference, acceleration of primitives
that dominate GenAI such as general matrix-vector multiplication (GEMV) is receiving …
Cross-Stack Optimizations for Sequence-Based Models on GPUs
S Pati - 2024 - search.proquest.com
Advancements in the field of machine learning have made deep neural networks (DNNs)
ubiquitous. Their application in the domain of natural language processing (NLP) with …