GTC 2020 S21736
Presenters: Tianhao Xu, NVIDIA
We’ll give an overview of the TensorRT Hyperscale Inference Platform. We start with a deep dive into current features and internal architecture, then cover deployment possibilities in a generic deployment ecosystem. Next, we’ll give a hands-on overview of NVIDIA BERT, FasterTransformer, and TRT-optimized BERT inference. Then we’ll get into how to deploy a BERT TensorFlow model with a custom op, how to deploy a BERT TensorRT model with plugins, and benchmarking. We’ll finish with other optimization techniques and open discussion.
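As a taste of the benchmarking portion of the session, the sketch below shows a generic latency/throughput harness of the kind typically used to measure inference performance. It is framework-agnostic and not NVIDIA's actual tooling; the `benchmark` function and its parameters are illustrative, and the dummy workload stands in for a real BERT forward pass (e.g. a TensorRT execution-context invocation).

```python
import time
import statistics

def benchmark(fn, warmup=10, iters=100):
    """Measure per-call latency of `fn` and report common inference metrics.

    `fn` stands in for a single inference call; this harness itself is
    framework-agnostic and purely illustrative.
    """
    for _ in range(warmup):  # warm-up runs exclude one-time setup costs
        fn()
    latencies = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        latencies.append((time.perf_counter() - t0) * 1000.0)  # milliseconds
    latencies.sort()
    return {
        "mean_ms": statistics.mean(latencies),
        "p50_ms": latencies[len(latencies) // 2],
        "p99_ms": latencies[min(int(iters * 0.99), iters - 1)],
        "throughput_qps": 1000.0 / statistics.mean(latencies),
    }

if __name__ == "__main__":
    # Dummy CPU workload in place of a real model inference call.
    stats = benchmark(lambda: sum(i * i for i in range(10000)))
    print(stats)
```

Reporting tail latency (p99) alongside the mean matters for hyperscale serving, since service-level objectives are usually stated in percentiles rather than averages.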