Review of chiplet-based design: system architecture and interconnection
Y Liu, X Li, S Yin - Science China Information Sciences, 2024 - Springer
Chiplet-based design, which breaks a system into multiple smaller dice (or “chiplets”) and
reassembles them into a new system chip through advanced packaging, has received …
Adapt-NoC: A flexible network-on-chip design for heterogeneous manycore architectures
The increased computational capability in heterogeneous manycore architectures facilitates
the concurrent execution of many applications. This requires, among other things, a flexible …
On-chip communication network for efficient training of deep convolutional networks on heterogeneous manycore systems
Convolutional Neural Networks (CNNs) have shown a great deal of success in diverse
application domains including computer vision, speech recognition, and natural language …
Learning-based application-agnostic 3D NoC design for heterogeneous manycore systems
The rising use of deep learning and other big-data algorithms has led to an increasing
demand for hardware platforms that are computationally powerful, yet energy-efficient. Due …
A versatile and flexible chiplet-based system design for heterogeneous manycore architectures
Heterogeneous manycore architectures are deployed to simultaneously run multiple and
diverse applications. This requires various computing capabilities (CPUs, GPUs, and …
Opportunistic computing in GPU architectures
Data transfer overhead between computing cores and memory hierarchy has been a
persistent issue for von Neumann architectures and the problem has only become more …
Morpheus: Extending the last level cache capacity in GPU systems using idle GPU core resources
Graphics Processing Units (GPUs) are widely-used accelerators for data-parallel
applications. In many GPU applications, GPU memory bandwidth bottlenecks performance …
OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures
As we integrate data-parallel GPUs with general-purpose CPUs on a single chip, the
enormous cache traffic generated by GPUs will not only exhaust the limited cache capacity …
LTRF: Enabling high-capacity register files for GPUs via hardware/software cooperative register prefetching
Graphics Processing Units (GPUs) employ large register files to accommodate all active
threads and accelerate context switching. Unfortunately, register files are a scalability …
A survey of architectural approaches for improving GPGPU performance, programmability and heterogeneity
With the skyrocketing advances of process technology, the increased need to process huge
amount of data, and the pivotal need for power efficiency, the usage of Graphics Processing …