Improving INT8 Accuracy Using Quantization Aware Training and the NVIDIA Transfer Learning Toolkit

Originally published at: https://developer.nvidia.com/blog/improving-int8-accuracy-using-quantization-aware-training-and-the-transfer-learning-toolkit/

Deep neural network (DNN) models are routinely used in applications that analyze video stream content, for tasks such as object detection, classification, and segmentation. Typically, these models are trained on high-end GPUs, either in stand-alone servers such as the NVIDIA DGX-1 or on servers available in data centers or in private or public clouds. Such…