Efficient memory management for large language model serving with PagedAttention

W Kwon, Z Li, S Zhuang, Y Sheng, L Zheng… - Proceedings of the 29th …, 2023 - dl.acm.org
High throughput serving of large language models (LLMs) requires batching sufficiently
many requests at a time. However, existing systems struggle because the key-value cache …

ZenFS+: Nurturing performance and isolation to ZenFS

M Oh, S Yoo, J Choi, J Park, CE Choi - IEEE Access, 2023 - ieeexplore.ieee.org
This paper proposes ZenFS+, a new storage backend of RocksDB for small-zone ZNS SSD.
RocksDB has complicated internal operations such as flush and compaction. Flush and …

Prism: Optimizing key-value store for modern heterogeneous storage devices

Y Song, WH Kim, SK Monga, C Min… - Proceedings of the 28th …, 2023 - dl.acm.org
As data generation has been on an upward trend, storing vast volumes of data cost-
effectively as well as efficiently accessing them is paramount. At the same time, today's …

Revisiting Secondary Indexing in LSM-based Storage Systems with Persistent Memory

J Wang, Y Lu, Q Wang, Y Zhang, J Shu - 2023 USENIX Annual Technical …, 2023 - usenix.org
LSM-based storage systems are widely used for superior write performance on block
devices. However, they currently fail to efficiently support secondary indexing, since a …

Replicating Persistent Memory Key-Value Stores with Efficient RDMA Abstraction

Q Wang, Y Lu, J Wang, J Shu - 17th USENIX Symposium on Operating …, 2023 - usenix.org
Combining persistent memory (PM) with RDMA is a promising approach to performant
replicated distributed key-value stores (KVSs). However, existing replication approaches do …

MoltDB: Accelerating Blockchain via Ancient State Segregation

J Liang, W Chen, Z Hong, H Zhu… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Blockchains store states in Log-Structured Merge (LSM) tree-based databases. Due to
blockchain traceability, the growing ancient states are inevitably stored in the databases …

Perseid: A Secondary Indexing Mechanism for LSM-Based Storage Systems

J Wang, Y Lu, Q Wang, Y Zhang, J Shu - ACM Transactions on Storage, 2024 - dl.acm.org
LSM-based storage systems are widely used for superior write performance on block
devices. However, they currently fail to efficiently support secondary indexing, since a …

PetPS: Supporting huge embedding models with persistent memory

M Xie, Y Lu, Q Wang, Y Feng, J Liu, K Ren… - Proceedings of the VLDB …, 2023 - dl.acm.org
Embedding models are effective for learning high-dimensional sparse data. Traditionally,
they are deployed in DRAM parameter servers (PS) for online inference access. However …

TrieKV: A High-Performance Key-Value Store Design with Memory as Its First-Class Citizen

H Sun, D Kong, S Jiang, Y Yue… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Key-value (KV) stores based on log-structured merge tree (LSM-tree) have been extensively
studied and deployed in major information technology infrastructures. Because this type of …

Optimizing File Systems on Heterogeneous Memory by Integrating DRAM Cache with Virtual Memory Management

Y Liu, Y Ren, M Liu, H Li, H Guo, X Miao, X Hu… - … USENIX Conference on …, 2024 - usenix.org
This paper revisits the usage of DRAM cache in DRAM-PM heterogeneous memory file
systems. With a comprehensive analysis of existing file systems with cache-based and DAX …