FPGA HLS today: successes, challenges, and opportunities

J Cong, J Lau, G Liu, S Neuendorffer, P Pan… - ACM Transactions on …, 2022 - dl.acm.org
The year 2011 marked an important transition for FPGA high-level synthesis (HLS), as it
went from prototyping to deployment. A decade later, in this article, we assess the progress …

The future of FPGA acceleration in datacenters and the cloud

C Bobda, JM Mbongue, P Chow, M Ewais… - ACM Transactions on …, 2022 - dl.acm.org
In this article, we survey existing academic and commercial efforts to provide Field-
Programmable Gate Array (FPGA) acceleration in datacenters and the cloud. The goal is a …

Co-design hardware and algorithm for vector search

W Jiang, S Li, Y Zhu, J de Fine Licht, Z He… - Proceedings of the …, 2023 - dl.acm.org
Vector search has emerged as the foundation for large-scale information retrieval and
machine learning systems, with search engines like Google and Bing processing tens of …

INSPIRE: in-storage private information retrieval via protocol and architecture co-design

J Lin, L Liang, Z Qu, I Ahmad, L Liu, F Tu… - Proceedings of the 49th …, 2022 - dl.acm.org
Private Information Retrieval (PIR) plays a vital role in secure, database-centric applications.
However, existing PIR protocols explore a massive working space containing hundreds of …

Sorting in memristive memory

MR Alam, MH Najafi, N TaheriNejad - ACM Journal on Emerging …, 2022 - dl.acm.org
Sorting data is needed in many application domains. Traditionally, the data is read from
memory and sent to a general-purpose processor or application-specific hardware for …

NASCENT2: Generic near-storage sort accelerator for data analytics on SmartSSD

S Salamat, H Zhang, YS Ki, T Rosing - ACM Transactions on …, 2022 - dl.acm.org
As the size of data generated every day grows dramatically, the computational bottleneck of
computer systems has shifted toward storage devices. The interface between the storage …

Debugging in the brave new world of reconfigurable hardware

J Ma, G Zuo, K Loughlin, H Zhang, A Quinn… - Proceedings of the 27th …, 2022 - dl.acm.org
Software and hardware development cycles have traditionally been quite distinct. Software
allows post-deployment patches, which leads to a rapid development cycle. In contrast …

NDSEARCH: Accelerating graph-traversal-based approximate nearest neighbor search through near data processing

Y Wang, S Li, Q Zheng, L Song, Z Li… - 2024 ACM/IEEE 51st …, 2024 - ieeexplore.ieee.org
Approximate nearest neighbor search (ANNS) is a key retrieval technique for vector
database and many data center applications, such as person re-identification and …

A review on computational storage devices and near memory computing for high performance applications

D Fakhry, M Abdelsalam, MW El-Kharashi… - … , Devices, Circuits and …, 2023 - Elsevier
The von Neumann bottleneck is imposed due to the explosion of data transfers and
emerging data-intensive applications in heterogeneous system architectures. The …

Near-storage processing for solid state drive based recommendation inference with SmartSSDs®

M Soltaniyeh, V Lagrange Moutinho Dos Reis… - Proceedings of the …, 2022 - dl.acm.org
Deep learning-based recommendation systems are extensively deployed in numerous
internet services, including social media, entertainment services, and search engines, to …