Query processing on tensor computation runtimes

D He, S Nakandala, D Banda, R Sen, K Saur… - arXiv preprint arXiv …, 2022 - arxiv.org
The huge demand for computation in artificial intelligence (AI) is driving unparalleled
investments in hardware and software systems for AI. This leads to an explosion in the …

Optimizing tensor programs on flexible storage

M Schleich, A Shaikhha, D Suciu - … of the ACM on Management of Data, 2023 - dl.acm.org
Tensor programs often need to process large tensors (vectors, matrices, or higher-order
tensors) that require a specialized storage format for their memory layout. Several such …

Autoscheduling for sparse tensor algebra with an asymptotic cost model

W Ahrens, F Kjolstad, S Amarasinghe - Proceedings of the 43rd ACM …, 2022 - dl.acm.org
While loop reordering and fusion can make big impacts on the constant-factor performance
of dense tensor programs, the effects on sparse tensor programs are asymptotic, often …

nsDB: Architecting the next generation database by integrating neural and symbolic systems

Y Yuan, B Tang, T Zhou, Z Zhang, J Qin - Proceedings of the VLDB …, 2024 - dl.acm.org
In this paper, we propose nsDB, a novel neuro-symbolic database system that integrates
neural and symbolic system architectures natively to address the weaknesses of each …

Bagua: scaling up distributed learning with system relaxations

S Gan, X Lian, R Wang, J Chang, C Liu, H Shi… - arXiv preprint arXiv …, 2021 - arxiv.org
Recent years have witnessed a growing list of systems for distributed data-parallel training.
Existing systems largely fit into two paradigms, i.e., parameter server and MPI-style collective …

Indexed Streams: A formal intermediate representation for fused contraction programs

S Kovach, P Kolichala, T Gu, F Kjolstad - Proceedings of the ACM on …, 2023 - dl.acm.org
We introduce indexed streams, a formal operational model and intermediate representation
that describes the fused execution of a contraction language that encompasses both sparse …

In-database machine learning with CorgiPile: Stochastic gradient descent without full data shuffle

L Xu, S Qiu, B Yuan, J Jiang, C Renggli, S Gan… - Proceedings of the …, 2022 - dl.acm.org
Stochastic gradient descent (SGD) is the cornerstone of modern ML systems. Despite its
computational efficiency, SGD requires random data access that is inherently inefficient …

Stochastic gradient descent without full data shuffle: with applications to in-database machine learning and deep learning systems

L Xu, S Qiu, B Yuan, J Jiang, C Renggli, S Gan… - The VLDB Journal, 2024 - Springer
Modern machine learning (ML) systems commonly use stochastic gradient descent (SGD) to
train ML models. However, SGD relies on random data order to converge, which usually …
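The two entries above study SGD without a full data shuffle, replacing a complete random permutation of the dataset with shuffling at the granularity of storage blocks plus shuffling inside a bounded in-memory buffer. A minimal illustrative sketch of that two-level idea (the function name, buffer scheme, and data layout here are my own simplification, not the papers' implementation):

```python
import random

def two_level_shuffle(blocks, buffer_blocks, seed=0):
    """Sketch of two-level shuffling: randomize the order in which on-disk
    blocks are read (level 1), then fill a small in-memory buffer with a few
    blocks and shuffle the tuples inside it (level 2). This approximates a
    full random shuffle without ever materializing the whole dataset."""
    rng = random.Random(seed)
    order = list(range(len(blocks)))
    rng.shuffle(order)  # level 1: shuffle block read order
    for start in range(0, len(order), buffer_blocks):
        # pull the next few blocks into the buffer, then shuffle tuples
        buffer = [t for i in order[start:start + buffer_blocks]
                  for t in blocks[i]]
        rng.shuffle(buffer)  # level 2: shuffle tuples within the buffer
        yield from buffer

# Usage: 4 blocks of 3 tuples each, buffer holding 2 blocks at a time.
blocks = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
stream = list(two_level_shuffle(blocks, buffer_blocks=2))
assert sorted(stream) == list(range(12))  # each tuple visited exactly once
```

The appeal for in-database training is that level 1 needs only sequential block I/O and level 2 needs only a buffer-sized amount of memory, while the resulting tuple order is random enough in expectation for SGD to converge.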

A Comparison of End-to-End Decision Forest Inference Pipelines

H Guan, S Masood, M Dwarampudi, V Gunda… - Proceedings of the …, 2023 - dl.acm.org
Decision forests, including RandomForest, XGBoost, and LightGBM, dominate the machine
learning tasks over tabular data. Recently, several frameworks were developed for decision …

Multi-cluster high performance computing method based on multimodal tensor in enterprise resource planning system

H Zhang, R Xia, H Ye, D Shi, P Li, W Fan - Physical Communication, 2024 - Elsevier
The big data representation and processing method based on multimodal tensors can
achieve the fusion representation of different types of data, and perform correlation analysis …