Network intrusion detection of drones using recurrent neural networks
Summary: Flying Ad Hoc Networks (FANETs) have attracted a great deal of interest over recent
times because of their significant applications. Thus, various examinations have been led on …
Transparent GPU sharing in container clouds for deep learning workloads
Containers are widely used for resource management in datacenters. A common practice to
support deep learning (DL) training in container clouds is to statically bind GPUs to …
Grape: Practical and Efficient Graphed Execution for Dynamic Deep Neural Networks on GPUs
Achieving high performance in machine learning workloads is a crucial yet difficult task. To
achieve high runtime performance on hardware platforms such as GPUs, graph-based …
TimeRL: Efficient Deep Reinforcement Learning with Polyhedral Dependence Graphs
Modern deep learning (DL) workloads increasingly use complex deep reinforcement
learning (DRL) algorithms that generate training data within the learning loop. This results in …
ACE: Efficient GPU Kernel Concurrency for Input-Dependent Irregular Computational Graphs
GPUs are widely used to accelerate many important classes of workloads today. However,
in this work, we observe that several important emerging classes of workloads, including …
Demystifying the TensorFlow eager execution of deep learning inference on a CPU-GPU tandem
Machine Learning (ML) frameworks are tools that facilitate the development and deployment
of ML models. These tools are major catalysts of the recent explosion in ML models and …
DeepContext: A Context-aware, Cross-platform, and Cross-framework Tool for Performance Profiling and Analysis of Deep Learning Workloads
Effective performance profiling and analysis are essential for optimizing training and
inference of deep learning models, especially given the growing complexity of …
Generating GPU compiler heuristics using reinforcement learning
GPU compilers are complex software programs with many optimizations specific to target
hardware. These optimizations are often controlled by heuristics hand-designed by compiler …
ACS: Concurrent Kernel Execution on Irregular, Input-Dependent Computational Graphs
GPUs are widely used to accelerate many important classes of workloads today. However,
we observe that several important emerging classes of workloads, including simulation …
Multi-level Analysis of GPU Utilization in ML Training Workloads
Training time has become a critical bottleneck due to the recent proliferation of large-
parameter ML models. GPUs continue to be the prevailing architecture for training ML …