TRT for yolov3: FP16 and INT8 optimization failed

Using the following repo: https://github.com/vat-nvidia/deepstream-plugins, I was able to get an optimized model for the default YOLOv3 model with FP32 precision (kFLOAT). However, it fails when I try to use other precisions in trt-yolo-app:
a) kHALF
Platform doesn’t support this precision.
trt-yolo-app: yolo.cpp:150: void Yolo::createYOLOEngine(int, std::__cxx11::string, std::__cxx11::string, std::__cxx11::string, nvinfer1::DataType, Int8EntropyCalibrator*): Assertion `0' failed.

b) kINT8
I'm currently trying to get this working with the default calibration table, as the app throws an exception:
Using cached calibration table to build the engine
trt-yolo-app: ../builder/cudnnBuilder2.cpp:1227: nvinfer1::cudnn::Engine* nvinfer1::builder::buildEngine(nvinfer1::CudaEngineBuildConfig&, const nvinfer1::cudnn::HardwareContext&, const nvinfer1::Network&): Assertion `it != tensorScales.end()' failed.
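
For context on the kHALF failure above: the GTX 1080 Ti is a compute capability 6.1 device, and NVIDIA's TensorRT hardware support matrix lists 6.1 as supporting INT8 but not FP16. The sketch below encodes that table as recalled from the matrix, so it is an assumption to verify against the matrix for the TensorRT version in use. `supportsPrecision` is a hypothetical helper, not part of TensorRT or the repo. In TensorRT itself the equivalent queries are `IBuilder::platformHasFastFp16()` and `IBuilder::platformHasFastInt8()`; the "Platform doesn't support this precision" message presumably comes from such a check, though that has not been confirmed against yolo.cpp.

```cpp
#include <string>

// Precision modes, named after the nvinfer1::DataType values in the post.
enum class Precision { kFLOAT, kHALF, kINT8 };

// Hypothetical helper (not a TensorRT API): reports whether a GPU with the
// given CUDA compute capability is listed as supporting a precision mode.
// The table is an assumption based on NVIDIA's TensorRT support matrix.
bool supportsPrecision(int major, int minor, Precision p) {
    const int cc = major * 10 + minor;  // e.g. 6.1 -> 61
    switch (p) {
        case Precision::kFLOAT:
            return true;  // FP32 is assumed available on every supported GPU
        case Precision::kHALF:
            return cc == 53 || cc == 60 || cc == 62 || cc >= 70;
        case Precision::kINT8:
            return cc == 61 || cc >= 70;
    }
    return false;
}

// Human-readable name for log messages.
std::string precisionName(Precision p) {
    switch (p) {
        case Precision::kFLOAT: return "kFLOAT";
        case Precision::kHALF:  return "kHALF";
        case Precision::kINT8:  return "kINT8";
    }
    return "unknown";
}
```

Under this table, `supportsPrecision(6, 1, Precision::kHALF)` is false and `supportsPrecision(6, 1, Precision::kINT8)` is true, so on this card the kHALF assertion would be expected behaviour, while the kINT8 failure would have a different cause.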

Also, a few questions:

  1. If kSAVE_DETECTIONS is configured as true, the images appear in the folder, but there are no bounding boxes drawn. Is that how it’s supposed to be?
  2. Is there a tool to perform the INT8 calibration on a custom dataset? I see some related classes and the calibration table for the default YOLO, but not a complete tool.
    There are related bits and pieces in README for NvYolo plugin to GStreamer, but we are not testing DeepStream SDK just yet.
  3. Batch_size parameter in the sample app - I would assume that it is used for specifying multiple images to be sent to the GPU at once (as a batch), which should be faster than processing images one by one.
    But using a batch_size of 4 shows an increase in the reported frame processing time to about 17ms per image.
  4. The GitHub repo specifies CUDA 9.2 and TensorRT 4.x as requirements. However, since darknet uses CUDA 9.0, that’s the version I used. Could this cause issues or reduce performance?
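
On question 3, one thing worth ruling out is whether the reported figure covers the whole batch rather than a single image. Below is a minimal sketch of the normalization, assuming (this is not confirmed from the app's source) that the timer wraps one inference call for the entire batch. `perImageMs` and `imagesPerSecond` are hypothetical names used only for illustration.

```cpp
#include <stdexcept>

// Hypothetical helper: converts a time measured for a whole batch into the
// average time spent per image.
double perImageMs(double batchMs, int batchSize) {
    if (batchSize <= 0) {
        throw std::invalid_argument("batchSize must be positive");
    }
    return batchMs / batchSize;
}

// Hypothetical helper: throughput in images per second for the same
// measurement.
double imagesPerSecond(double batchMs, int batchSize) {
    if (batchSize <= 0 || batchMs <= 0.0) {
        throw std::invalid_argument("batchSize and batchMs must be positive");
    }
    return 1000.0 * batchSize / batchMs;
}
```

Under that assumption, 17 ms for a batch of 4 would correspond to `perImageMs(17.0, 4)`, i.e. 4.25 ms per image. If the app already divides by the batch size, no normalization applies and the slowdown is real.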

Software and hardware used:
Ubuntu 16.04.5, NVIDIA graphics driver 380.134, CUDA 9.0, cuDNN 7.1.3, TensorRT 4.0.1.6,
Asus GTX 1080 Ti Turbo at default clocks.

Hello,

Your questions are specific to the deepstream-plugins repo. Please raise them with its maintainers at https://github.com/vat-nvidia/deepstream-plugins.