A comprehensive survey on SmartNICs: Architectures, development models, applications, and research directions

EF Kfoury, S Choueiri, A Mazloum, A AlSabeh… - IEEE …, 2024 - ieeexplore.ieee.org
The end of Moore's Law and Dennard Scaling has slowed processor improvements in the
past decade. While multi-core processors have improved performance, they are limited by …

The nanoPU: A nanosecond network stack for datacenters

S Ibanez, A Mallery, S Arslan, T Jepsen… - … on Operating Systems …, 2021 - usenix.org
We present the nanoPU, a new NIC-CPU co-design to accelerate an increasingly pervasive
class of datacenter applications: those that utilize many small Remote Procedure Calls …

Xenic: SmartNIC-accelerated distributed transactions

HN Schuh, W Liang, M Liu, J Nelson… - Proceedings of the …, 2021 - dl.acm.org
High-performance distributed transactions require efficient remote operations on database
memory and protocol metadata. The high communication cost of this workload calls for …

PANIC: A High-Performance programmable NIC for multi-tenant networks

J Lin, K Patel, BE Stephens, A Sivaraman… - … USENIX Symposium on …, 2020 - usenix.org
Programmable NICs have diverse uses, and there is need for a NIC platform that can offload
computation from multiple co-resident applications to many different types of substrates …

Dagger: efficient and fast RPCs in cloud microservices with near-memory reconfigurable NICs

N Lazarev, S Xiang, N Adit, Z Zhang… - Proceedings of the 26th …, 2021 - dl.acm.org
The ongoing shift of cloud services from monolithic designs to microservices creates high
demand for efficient and high performance datacenter networking stacks, optimized for fine …

Reexamining direct cache access to optimize I/O intensive applications for multi-hundred-gigabit networks

A Farshin, A Roozbeh, GQ Maguire Jr… - 2020 USENIX Annual …, 2020 - usenix.org
Memory access is the major bottleneck in realizing multi-hundred-gigabit networks with
commodity hardware, hence it is essential to make good use of cache memory that is a …

Switch code generation using program synthesis

X Gao, T Kim, MD Wong, D Raghunathan… - Proceedings of the …, 2020 - dl.acm.org
Writing packet-processing programs for programmable switch pipelines is challenging
because of their all-or-nothing nature: a program either runs at line rate if it can fit within …

HiveMind: A hardware-software system stack for serverless edge swarms

L Patterson, D Pigorovsky, B Dempsey… - Proceedings of the 49th …, 2022 - dl.acm.org
Swarms of autonomous devices are increasing in ubiquity and size, making the need for
rethinking their hardware-software system stack critical. We present HiveMind, the first …

Zerializer: Towards zero-copy serialization

A Wolnikowski, S Ibanez, J Stone, C Kim… - Proceedings of the …, 2021 - dl.acm.org
Achieving zero-copy I/O has long been an important goal in the networking community.
However, data serialization obviates the benefits of zero-copy I/O, because it requires the …

ML-FaaS: Toward exploiting the serverless paradigm to facilitate machine learning functions as a service

E Paraskevoulakou, D Kyriazis - IEEE Transactions on Network …, 2023 - ieeexplore.ieee.org
Serverless computing has emerged as a revolutionary model that enables the deployment of
applications and services by raising the level of abstraction from the underlying resources. Its …