GTC 2020, S21928
Presenter: Grzegorz Karch, NVIDIA
Abstract
Learn how to get your neural network from the PyTorch framework into production. Explore ways to handle complex neural network architectures during deployment. We’ll show how to transform a neural network developed in PyTorch into a model ready for a production environment and exemplify the workflow on a conversational AI system. For full understanding, you should be familiar with PyTorch framework and have some interest in model deployment for inference. We’ll demonstrate the neural network system on TensorRT Inference Server (TRTIS).
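The abstract's workflow of turning a PyTorch model into a deployable artifact typically starts with exporting it to TorchScript, which Triton's PyTorch backend can serve. A minimal sketch, assuming a toy `TinyNet` module standing in for the talk's conversational-AI network (the model and file name are illustrative, not from the talk):

```python
import torch
import torch.nn as nn

# Hypothetical toy model; a stand-in for the talk's conversational-AI network.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.fc(x))

model = TinyNet().eval()
example = torch.randn(1, 4)

# Trace the model to TorchScript; Triton's PyTorch backend consumes such files.
traced = torch.jit.trace(model, example)
traced.save("model.pt")

# Reload the serialized module and verify it matches the eager model.
reloaded = torch.jit.load("model.pt")
assert torch.allclose(model(example), reloaded(example))
```

The saved `model.pt` would then be placed in a Triton model repository alongside its configuration.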
Thanks for the talk, with conversational AI as the example (even though it's not my area of research).
Questions:
1: Slide 42: Where can I find deployer.py? Is the deployer specific to your conversational AI models, or is it general and could in principle work for any model?
2: Slide 32: So if I have nn.Conv1d in my model, I should convert it to nn.Conv2d. But does that affect the performance or quality of a model originally designed with 1-D convolutions?
3: Your presentation is tailored to deployment on Triton Inference Server. I assume the basic steps for deploying on, say, any other cloud platform are not much different. Correct?
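On question 2, a 1-D convolution is mathematically identical to a 2-D convolution whose kernel has height 1 applied to the signal viewed as a 1-row image, so the conversion should not change the model's outputs. A small NumPy sketch of this equivalence (the `conv1d`/`conv2d` helpers are illustrative, not from the talk):

```python
import numpy as np

def conv1d(x, w):
    """Valid cross-correlation of a 1-D signal x with kernel w."""
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

def conv2d(x, w):
    """Valid cross-correlation of a 2-D input x with 2-D kernel w."""
    kh, kw = w.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

signal = np.arange(10.0)
kernel = np.array([1.0, -2.0, 1.0])

out_1d = conv1d(signal, kernel)
# View the signal as a 1xN "image" and the kernel as a 1xK filter.
out_2d = conv2d(signal[None, :], kernel[None, :])

assert np.allclose(out_1d, out_2d[0])
```

Since the arithmetic is the same weight-for-weight, any quality difference after conversion would point to a reshaping or weight-copying bug rather than to the 1-D-to-2-D mapping itself.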