Approximate computing survey, Part II: Application-specific & architectural approximation techniques and applications
The challenging deployment of compute-intensive applications from domains such as
Artificial Intelligence (AI) and Digital Signal Processing (DSP) forces the community of …
OliVe: Accelerating large language models via hardware-friendly outlier-victim pair quantization
Transformer-based large language models (LLMs) have achieved great success with the
growing model size. LLMs' size grows by 240× every two years, which outpaces the …
RPTQ: Reorder-based post-training quantization for large language models
A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
The rapid development of large language models (LLMs) has significantly transformed the
field of artificial intelligence, demonstrating remarkable capabilities in natural language …
vTensor: Flexible virtual tensor management for efficient LLM serving
Large Language Models (LLMs) are widely used across various domains, processing
millions of daily requests. This surge in demand poses significant challenges in optimizing …