BSViT: A Bit-Serial Vision Transformer Accelerator Exploiting Dynamic Patch and Weight Bit-Group...

Anda: Unlocking Efficient LLM Inference with a Variable-Length Grouped Activation Data Format

C Fang, M Shi, R Geens, A Symons, Z Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

The widely-used, weight-only quantized large language models (LLMs), which leverage low-bit integer (INT) weights and retain floating-point (FP) activations, reduce storage …
