Deep Lake: A lakehouse for deep learning

S Hambardzumyan, A Tuli, L Ghukasyan… - arXiv preprint arXiv …, 2022 - arxiv.org
Traditional data lakes provide critical data infrastructure for analytical workloads by enabling
time travel, running SQL queries, ingesting data with ACID transactions, and visualizing …

Efficient Data Transfer in Shared-storage Cloud Data Processing Systems with OPTICS

G Paulos, T Zhang, Y Zhang, G Zhu, J Karimov… - Proceedings of the 33rd …, 2023 - dl.acm.org
We propose OPTICS, a middleware that optimizes data transfer between the compute layer
and the shared object storage in today's cloud data processing systems. OPTICS …

SOLAR: A Highly Optimized Data Loading Framework for Distributed Training of CNN-based Scientific Surrogates

B Sun, X Yu, C Zhang, J Tian, S Jin, K Iskra… - arXiv preprint arXiv …, 2022 - arxiv.org
CNN-based surrogates have become prevalent in scientific applications to replace
conventional time-consuming physical approaches. Although these surrogates can yield …

Exoshuffle-CloudSort

FS Luan, S Wang, S Yagati, S Kim, K Lien, I Ong… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Exoshuffle-CloudSort, a sorting application running on top of Ray using the
Exoshuffle architecture. Exoshuffle-CloudSort runs on Amazon EC2, with input and output …

ASRDataset: A Multi-granularity Shuffle System for Preparing Large-scale ASR Training Data

F Jie, H Zhang, J Wang, Z Yu - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
Automatic Speech Recognition (ASR) is an essential task in the field of artificial intelligence.
With the widespread application of deep learning (DL), end-to-end ASR systems have …

[PDF] CONCEPTION D'UNE MÉTHODE D'APPRENTISSAGE MACHINE POUR LA DÉTECTION DES MICROORGANISMES BACTÉRIENS À PARTIR DE DONNÉES …

N DE MONTIGNY - 2024 - archipel.uqam.ca
In the field of bioinformatics, programs are designed to enable and
facilitate the management, processing, analysis, and interpretation of biological data of all …

[BOOK][B] Efficient Shuffle for Flash Burst Computing

Y Li - 2022 - search.proquest.com
Shuffle is the operation of exchanging arbitrary data among a group of servers, and it is a
fundamental communication primitive in distributed computing. In particular, shuffle has …