GTC 2020: Improve ML Training Performance with Amazon SageMaker Debugger (Presented by Amazon Web Services)

GTC 2020 S22493
Presenters: Shashank Prasanna, Amazon Web Services; Satadal Bhattacharjee, Amazon Web Services
Abstract
During ML model training, it’s hard to verify that a model is progressively learning the correct parameter values, and hard to analyze and debug model characteristics without building additional tools, which makes the process time-consuming and cumbersome. With Amazon SageMaker Debugger, developers can gain insight into the training process by automating data capture and analysis from training runs without code changes. We’ll take a closer look at how you can define rules to monitor and analyze tensors and watch for issues in your model. By monitoring the training flow, developers can improve GPU utilization, reduce troubleshooting time during training, and build high-quality models.
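
To make the idea of a "rule" concrete, here is a minimal, library-free sketch of what rule-based tensor monitoring could look like. It is not the SageMaker Debugger API: the `Tensors` and `Rule` types, the `gradients/` naming prefix, the `1e-7` threshold, and all function names are assumptions made for illustration only. Each rule inspects the tensors captured at one training step and returns the names of any that look problematic.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; these are NOT the SageMaker Debugger API.
Tensors = Dict[str, List[float]]       # tensor name -> flattened values at one step
Rule = Callable[[Tensors], List[str]]  # returns the names of offending tensors


def nan_or_inf_rule(tensors: Tensors) -> List[str]:
    """Flag tensors that contain NaN or infinite values."""
    return [
        name
        for name, values in tensors.items()
        if any(math.isnan(v) or math.isinf(v) for v in values)
    ]


def make_vanishing_gradient_rule(threshold: float = 1e-7) -> Rule:
    """Build a rule that flags gradients whose mean absolute value is below threshold.

    The "gradients/" prefix and the default threshold are assumed conventions.
    """
    def rule(tensors: Tensors) -> List[str]:
        flagged = []
        for name, values in tensors.items():
            if not name.startswith("gradients/") or not values:
                continue
            mean_abs = sum(abs(v) for v in values) / len(values)
            if mean_abs < threshold:
                flagged.append(name)
        return flagged
    return rule


def evaluate(rules: Dict[str, Rule], steps: List[Tensors]) -> List[Tuple[int, str, str]]:
    """Run every rule on every captured step.

    Returns (step_index, rule_name, tensor_name) for each finding.
    """
    findings = []
    for step, tensors in enumerate(steps):
        for rule_name, rule in rules.items():
            for tensor_name in rule(tensors):
                findings.append((step, rule_name, tensor_name))
    return findings


if __name__ == "__main__":
    # Two made-up training steps: the second has a NaN loss and a tiny gradient.
    captured_steps = [
        {"gradients/dense": [0.1, -0.2], "loss": [0.9]},
        {"gradients/dense": [1e-9, -1e-9], "loss": [float("nan")]},
    ]
    rules = {
        "nan_or_inf": nan_or_inf_rule,
        "vanishing_gradient": make_vanishing_gradient_rule(),
    }
    for finding in evaluate(rules, captured_steps):
        print(finding)
```

Keeping rules as plain functions over captured tensors separates data capture from analysis, which is the split the abstract describes; a managed service would run the capture and rule evaluation for you.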
