GTC 2020: PyTorch from Research to Production

GTC 2020 S21928
Presenters: Grzegorz Karch, NVIDIA
Abstract
Learn how to get your neural network from the PyTorch framework into production. Explore ways to handle complex neural network architectures during deployment. We’ll show how to transform a neural network developed in PyTorch into a model ready for a production environment and exemplify the workflow on a conversational AI system. For full understanding, you should be familiar with the PyTorch framework and have some interest in model deployment for inference. We’ll demonstrate the neural network system on TensorRT Inference Server (TRTIS).

Watch this session
Join in the conversation below.

Hi! Cannot play the video. It says “The media could not be loaded, either because the server or network failed or because the format is not supported”

Hi, please try again, it should be working now.

Thanks for the talk with conversational AI as the example (even though it’s not my area of research).

Questions:
1: Slide 42: Where can I find deployer.py? Is the deployer specific to your conversational AI models, or is it general enough that it could, in theory, work for any model?

2: Slide 32: So, if I have nn.Conv1d in my model, I should convert it to 2D. But does that affect the performance or quality of a model originally designed with 1D convolutions?

3: Your presentation is tailored to deployment on Triton Inference Server. I guess the basic steps required to deploy on, for example, any other cloud platform are not much different. Correct?
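
On question 2, here is a minimal sketch of how one could check the effect of such a conversion. It assumes the conversion means replacing nn.Conv1d with an nn.Conv2d whose kernel has height 1, with the input given a dummy height axis; the slide may describe a different recipe. The layer sizes and variable names are made up for illustration. Under that assumption the two layers compute the same values, so model quality should not change; any speed difference depends on the backend and is not measured here.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical layer sizes, chosen only for illustration.
conv1d = nn.Conv1d(in_channels=4, out_channels=8, kernel_size=3, padding=1)

# Assumed 2D equivalent: kernel height 1, so it only slides along the width.
conv2d = nn.Conv2d(in_channels=4, out_channels=8, kernel_size=(1, 3), padding=(0, 1))

# Reuse the 1D parameters, adding a dummy height dimension to the weights:
# (out_channels, in_channels, k) -> (out_channels, in_channels, 1, k).
with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(2))
    conv2d.bias.copy_(conv1d.bias)

x = torch.randn(2, 4, 16)  # (batch, channels, length)

with torch.no_grad():
    y1 = conv1d(x)                           # (batch, 8, length)
    y2 = conv2d(x.unsqueeze(2)).squeeze(2)   # add, then remove, the dummy height axis

# Largest element-wise difference between the two outputs.
max_abs_diff = (y1 - y2).abs().max().item()
same = torch.allclose(y1, y2, atol=1e-5)
print(f"shapes: {tuple(y1.shape)} vs {tuple(y2.shape)}, max abs diff: {max_abs_diff:.2e}")
```

If the outputs match up to floating-point tolerance, as they should here, the 2D layer is a drop-in replacement for the 1D one as long as the surrounding code adds and removes the extra axis.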