GTC 2020: PaddlePaddle with Distributed Training API, Automatic Mixed Precision, and TensorRT Integration

GTC 2020 S21436
Presenters: Bai-Cheng Jeng, NVIDIA; Jie Fang, NVIDIA; Daming Lu, Baidu USA
Abstract
We’ll introduce PaddlePaddle (PArallel Distributed Deep LEarning), an easy-to-use, efficient, flexible, and scalable deep-learning platform that has already been deployed in real business scenarios. For the training phase, PaddlePaddle provides a high-level distributed-training API named Fleet, which can distribute a training task across a GPU cluster. To further increase performance, PaddlePaddle supports mixed-precision training with only a few additional lines of code, achieving significant speedups through Tensor Cores. For the inference phase, PaddlePaddle integrates TensorRT to fuse operations and can run models in lower-precision modes to fully utilize GPU resources. With these three technologies, developers can significantly cut the time needed to train large-scale tasks and deploy models to production.
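The abstract does not include code, but the Fleet and mixed-precision APIs it describes can be sketched briefly. Below is a minimal sketch using the PaddlePaddle 2.x dygraph API (the session predates 2.x, so the exact names shown in the talk may differ); the model, data shapes, and hyperparameters are illustrative placeholders.

```python
import paddle
import paddle.distributed.fleet as fleet
import paddle.nn.functional as F

fleet.init(is_collective=True)  # collective mode: one process per GPU

model = paddle.vision.models.resnet50()
optimizer = paddle.optimizer.Momentum(
    learning_rate=0.1, parameters=model.parameters())

# Wrap the model and optimizer so gradients are all-reduced across GPUs.
model = fleet.distributed_model(model)
optimizer = fleet.distributed_optimizer(optimizer)

scaler = paddle.amp.GradScaler(init_loss_scaling=1024)  # dynamic loss scaling

for step in range(10):  # random tensors stand in for a real DataLoader
    images = paddle.randn([8, 3, 224, 224])
    labels = paddle.randint(0, 1000, [8, 1])
    with paddle.amp.auto_cast():       # run ops in FP16 where numerically safe
        loss = F.cross_entropy(model(images), labels)
    scaled = scaler.scale(loss)        # scale loss to avoid FP16 underflow
    scaled.backward()
    scaler.minimize(optimizer, scaled) # unscale, check inf/nan, then step
    optimizer.clear_grad()
```

A script like this would be launched with `python -m paddle.distributed.launch --gpus 0,1 train.py`, which starts one process per listed GPU.

For the inference side, the following sketch shows how the TensorRT subgraph engine is enabled through the Paddle Inference Python API in PaddlePaddle 2.x; the model file names are placeholders for an exported inference model.

```python
import numpy as np
from paddle.inference import Config, PrecisionType, create_predictor

config = Config("model.pdmodel", "model.pdiparams")  # placeholder model files
config.enable_use_gpu(100, 0)  # 100 MB initial GPU memory pool on device 0
config.enable_tensorrt_engine(
    workspace_size=1 << 30,             # 1 GB TensorRT workspace
    max_batch_size=1,
    min_subgraph_size=3,                # only fuse subgraphs with >= 3 ops
    precision_mode=PrecisionType.Half,  # run fused subgraphs in FP16
    use_static=False,
    use_calib_mode=False)
predictor = create_predictor(config)

# Feed one input and run; eligible subgraphs execute as fused TensorRT engines.
name = predictor.get_input_names()[0]
handle = predictor.get_input_handle(name)
handle.copy_from_cpu(np.random.rand(1, 3, 224, 224).astype("float32"))
predictor.run()
output = predictor.get_output_handle(predictor.get_output_names()[0]).copy_to_cpu()
```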

Watch this session