Optimizing and Accelerating AI Inference with the TensorRT Container from NVIDIA NGC

Originally published at: Optimizing and Accelerating AI Inference with the TensorRT Container from NVIDIA NGC | NVIDIA Technical Blog

Natural language processing (NLP) is one of the most challenging tasks for AI because it must understand context, phonetics, and accent to convert human speech into text. Building this AI workflow starts with training a model that can understand spoken language and process it into text. BERT is one of the best models for this…