Google Scholar

M Akhtar, O Benjelloun, C Conforti… - Advances in …, 2025 - proceedings.neurips.cc

Data is a critical resource for machine learning (ML), yet working with data remains a key
friction point. This paper introduces Croissant, a metadata format for datasets that creates a …

Cited by 31 - All 15 versions

[PDF] acm.org

Orion: Interference-aware, fine-grained GPU sharing for ML applications

F Strati, X Ma, A Klimovic - … of the Nineteenth European Conference on …, 2024 - dl.acm.org

GPUs are critical for maximizing the throughput-per-Watt of deep neural network (DNN)
applications. However, DNN applications often underutilize GPUs, even when using large …

Cited by 31 - All 4 versions

[PDF] vldb.org

Fastflow: Accelerating deep learning model training with smart offloading of input data pipeline

T Um, B Oh, B Seo, M Kweun, G Kim… - Proceedings of the VLDB …, 2023 - dl.acm.org

When training a deep learning (DL) model, input data are pre-processed on CPUs and
transformed into tensors, which are then fed into GPUs for gradient computations of model …

Cited by 33 - All 5 versions

[PDF] arxiv.org

An overview of the data-loader landscape: Comparative performance analysis

I Ofeidis, D Kiedanski… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org

The efficiency of Deep Learning (DL) training jobs is critically dependent on dataloaders,
which facilitate the transfer of data from storage to DL-accelerated hardware during training …

Cited by 13 - All 4 versions

[PDF] usenix.org

Pecan: Cost-Efficient ML Data Preprocessing with Automatic Transformation Ordering and Hybrid Placement

D Graur, O Mraz, M Li, S Pourghannad… - 2024 USENIX Annual …, 2024 - usenix.org

Input data preprocessing is a common bottleneck in machine learning (ML) jobs, that can
significantly increase training time and cost as expensive GPUs or TPUs idle waiting for …

Cited by 5 - All 5 versions

[PDF] acm.org

Where is my training bottleneck? hidden trade-offs in deep learning preprocessing pipelines

A Isenko, R Mayer, J Jedele, HA Jacobsen - Proceedings of the 2022 …, 2022 - dl.acm.org

Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep
the training processes busy. Maximizing resource utilization is becoming more challenging …

Cited by 31 - All 6 versions

[HTML] sciencedirect.com

[HTML] Data pipeline quality: Influencing factors, root causes of data-related issues, and processing problem areas for developers

H Foidl, V Golendukhina, R Ramler… - Journal of Systems and …, 2024 - Elsevier

Data pipelines are an integral part of various modern data-driven systems. However, despite
their importance, they are often unreliable and deliver poor-quality data. A critical step …

Cited by 13 - All 7 versions

[PDF] acm.org

tf.data service: A case for disaggregating ML input data processing

A Audibert, Y Chen, D Graur, A Klimovic… - Proceedings of the …, 2023 - dl.acm.org

Machine learning (ML) computations commonly execute on expensive specialized
hardware, such as GPUs and TPUs, which provide high FLOPs and performance-per-watt …

Cited by 18 - All 5 versions

[PDF] arxiv.org

PreSto: An In-Storage Data Preprocessing System for Training Recommendation Models

Y Lee, H Kim, M Rhu - 2024 ACM/IEEE 51st Annual …, 2024 - ieeexplore.ieee.org

Training recommendation systems (RecSys) faces several challenges as it requires the
“data preprocessing” stage to preprocess an ample amount of raw data and feed them to the …

Cited by 2 - All 6 versions

[PDF] acm.org

Intune: Reinforcement learning-based data pipeline optimization for deep recommendation models

K Nagrecha, L Liu, P Delgado… - Proceedings of the 17th …, 2023 - dl.acm.org

Deep learning-based recommender models (DLRMs) have become an essential component
of many modern recommender systems. Several companies are now building large compute …

Cited by 7 - All 4 versions

Plumber: Diagnosing and removing performance bottlenecks in machine learning data pipelines

Croissant: A metadata format for ml-ready datasets
