TensorRT 8.0: What’s New

TensorRT 8, the latest release of the high-performance deep learning inference SDK, is now generally available (GA) for download. This release includes:

  • BERT inference in 1.2 ms with new transformer optimizations
  • INT8 precision with accuracy equivalent to FP32, using Quantization Aware Training
  • Sparsity support for faster inference on Ampere GPUs
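
To make the sparsity item concrete, here is a minimal pure-Python sketch of the 2:4 structured pattern that Ampere sparse Tensor Cores are described as accelerating: in every group of four consecutive weights, at most two are non-zero. The 2:4 detail is background not stated in this announcement, and the function name `prune_2_of_4` and the magnitude-based selection rule are hypothetical choices for illustration only. This is not TensorRT code and makes no claim about how TensorRT or NVIDIA's tooling prunes weights.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in each group of four.

    Illustrative only: assumes a flat list of floats whose length is a
    multiple of 4. Real workflows usually prune and then fine-tune.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


row = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.01]
sparse_row = prune_2_of_4(row)
print(sparse_row)
```

Each group of four keeps only its two largest-magnitude weights, so half of the values are zero in a regular layout that hardware can exploit. Whether accuracy survives this pruning depends on the model and on any retraining done afterwards.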

Learn more about the new features and resources here.