Google Scholar

A computational stack for cross-domain acceleration

Mitigating Write Disturbance in Non-Volatile Memory via Coupling Machine Learning with Out-of-Place Updates

R Wu, Z Shen, Z Yang, J Shu - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

Non-volatile memory (NVM) opens up new opportunities to resolve scaling restrictions of
main memory, yet it is still hindered by the write disturbance (WD) problem. The WD problem …

Cited by 4

VeriGOOD-ML: An open-source flow for automated ML hardware synthesis

H Esmaeilzadeh, S Ghodrati, J Gu… - 2021 IEEE/ACM …, 2021 - ieeexplore.ieee.org

This paper introduces VeriGOOD-ML, an automated methodology for generating Verilog
with no human in the loop, starting from a high-level description of a machine learning (ML) …

Cited by 26

A reschedulable dataflow-SIMD execution for increased utilization in CGRA cross-domain acceleration

C Yin, N Jing, J Jiang, Q Wang… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org

When a coarse-grained reconfigurable array (CGRA) architecture shifts toward cross-
domain acceleration, control flow and memory accesses often degrade the processing …

Cited by 15

Energy-efficient hardware acceleration of shallow machine learning applications

Z Zeng, SS Sapatnekar - 2023 Design, Automation & Test in …, 2023 - ieeexplore.ieee.org

ML accelerators have largely focused on building general platforms for deep neural
networks (DNNs), but less so on shallow machine learning (SML) algorithms. This paper …

Cited by 8

LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale

J Cho, M Kim, H Choi, G Heo… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

Recently, there has been an extensive research effort in building efficient large language
model (LLM) inference serving systems. These efforts not only include innovations in the …

Cited by 3

End-to-end synthesis of dynamically controlled machine learning accelerators

S Curzel, NB Agostini, VG Castellana… - IEEE Transactions …, 2022 - ieeexplore.ieee.org

Edge systems are required to autonomously make real-time decisions based on large
quantities of input data under strict power, performance, area, and other constraints. Meeting …

Cited by 6

Improving utilization of dataflow unit for multi-batch processing

Z Fan, W Li, Z Wang, Y Yang, X Ye, D Fan… - ACM Transactions on …, 2024 - dl.acm.org

Dataflow architectures can achieve much better performance and higher efficiency than
general-purpose core, approaching the performance of a specialized design while retaining …

Cited by 2

A 28-nm Software-Defined Accelerator Chip With Circuit-Pipeline Scaling and Intrinsic Physical Unclonable Function Enabling Secure Configuration

J Zhu, B Yang, L Chen, J Chen, Y Zhang… - IEEE Journal of Solid …, 2025 - ieeexplore.ieee.org

As emerging applications raise ever-boosting and varying computational demand, the
reconfigurable accelerator is becoming prevalent due to balanced performance, efficiency …

Workload-Aware Hardware Accelerator Mining for Distributed Deep Learning Training

M Adnan, A Phanishayee, J Kulkarni, PJ Nair… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper, we present a novel technique to search for hardware architectures of
accelerators optimized for end-to-end training of deep neural networks (DNNs). Our …

APPEND: Rethinking ASIP Synthesis in the Era of AI

C Li, Y Wang, H Li, Y Han - 2023 60th ACM/IEEE Design …, 2023 - ieeexplore.ieee.org

Application-specific instruction-set processors (ASIP) has been widely used to speedup
specific applications based on general-purpose processor (CPU) ISA-extension and …

Cited by 2
