Can we do INT8 inference using python API?

I can only find very brief instructions for INT8 inference using the Python API. They say:

  1. import tensorrt as trt
  2. NUM_IMAGES_PER_BATCH = 5
    batchstream = ImageBatchStream(NUM_IMAGES_PER_BATCH, calibration_files)

However, I cannot find the definition of ImageBatchStream in the Python API, so I don't know how to carry out the following steps. I also checked the samples, but could only find INT8 samples written in C++, where the BatchStream class is defined in a header file.

So can we do INT8 inference using the Python API? If we can, how do we build the data pipeline?
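For what it's worth, ImageBatchStream is not part of the TensorRT Python API at all; it is a helper class you write yourself to feed calibration batches. A minimal sketch of such a class (the name, the CHW shape, and the placeholder preprocessing are assumptions, not anything from TensorRT):

```python
import numpy as np

class ImageBatchStream:
    """Yields fixed-size batches of preprocessed images for calibration.

    This is a user-defined helper, not a TensorRT class: the calibrator's
    get_batch() simply asks it for the next batch until it is exhausted.
    """

    def __init__(self, batch_size, calibration_files, shape=(3, 224, 224)):
        self.batch_size = batch_size
        self.files = calibration_files
        self.shape = shape           # CHW shape expected by the network
        self.index = 0

    def reset(self):
        self.index = 0

    def next_batch(self):
        """Return an (N, C, H, W) float32 array, or None when done."""
        if self.index >= len(self.files):
            return None
        chunk = self.files[self.index:self.index + self.batch_size]
        self.index += len(chunk)
        batch = np.stack([self.load(f) for f in chunk])
        return np.ascontiguousarray(batch, dtype=np.float32)

    def load(self, f):
        # Placeholder preprocessing: real code would decode the image file
        # and apply the same resize/normalization used at inference time.
        return np.zeros(self.shape, dtype=np.float32)

# Usage: iterate until the stream is exhausted.
stream = ImageBatchStream(5, ["img%d.jpg" % i for i in range(12)])
shapes = []
while True:
    b = stream.next_batch()
    if b is None:
        break
    shapes.append(b.shape)
print(shapes)  # three batches: 5, 5, and a final partial batch of 2
```

The calibrator's get_batch() callback would call next_batch(), copy the array to device memory, and return the device pointer.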

Hello,

Please see the developer guide for how to set precisions with the Python API: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#enable_int8_python

You can also reference this example which demonstrates how to use TensorRT to improve the inference performance by using INT8 reduced precision.
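Enabling INT8 from Python in the TensorRT 5.x API boils down to a couple of builder attributes. A sketch, assuming you already have a parsed network and a calibrator object (the import is deferred into the function so the snippet parses even without TensorRT installed):

```python
def build_int8_engine(builder, network, calibrator, max_batch_size=8):
    """Build an INT8 engine with the TensorRT 5.x Python API.

    `builder` is a trt.Builder, `network` a parsed INetworkDefinition,
    and `calibrator` an object implementing the IInt8Calibrator methods.
    """
    import tensorrt as trt  # deferred so the sketch parses without TRT

    if not builder.platform_has_fast_int8:
        raise RuntimeError("This GPU has no fast INT8 support")

    builder.max_batch_size = max_batch_size
    builder.int8_mode = True              # request INT8 kernels
    builder.int8_calibrator = calibrator  # supplies calibration batches
    return builder.build_cuda_engine(network)
```

Note that in later TensorRT releases these settings moved to the builder config (`config.set_flag(trt.BuilderFlag.INT8)` and `config.int8_calibrator`).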

regards,
NVIDIA Enterprise Support

Thanks for your reply. I have written a program based on https://devblogs.nvidia.com/int8-inference-autonomous-vehicles-tensorrt/. However, there is still a problem with the definition of Int8Calibrator::write_calibration_cache().
In the example, it accepts a parameter 'ptr' and converts it with int(ptr). However, in 5.0.2.6 this function receives the parameter as a capsule ('data: capsule'), and int(ptr) raises an error.

How can I fix this? I think the problem is caused by the API differences between TensorRT 3 and 5.
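One workaround for the capsule: CPython's `PyCapsule_GetPointer` can unwrap a capsule into a raw pointer, which `ctypes` can then read. This is a sketch of the technique only; whether TensorRT 5.0.2's capsule carries a name (and what that name is) is an assumption you would need to check, so try `None` first and pass the capsule's name if that raises. The demo below wraps a buffer in a capsule itself so it runs stand-alone:

```python
import ctypes

# Declare the CPython capsule API we need.
ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [ctypes.py_object,
                                                  ctypes.c_char_p]

def capsule_to_bytes(capsule, size, name=None):
    """Copy `size` bytes out of a PyCapsule wrapping a raw pointer.

    Inside write_calibration_cache(self, ptr, size) you could call
    capsule_to_bytes(ptr, size) and write the result to the cache file
    (assuming the callback still receives a size argument, as the
    TensorRT 3 example did).
    """
    ptr = ctypes.pythonapi.PyCapsule_GetPointer(capsule, name)
    return ctypes.string_at(ptr, size)

# Self-contained demo: wrap a buffer in a capsule ourselves, then unwrap it.
ctypes.pythonapi.PyCapsule_New.restype = ctypes.py_object
ctypes.pythonapi.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p,
                                           ctypes.c_void_p]
buf = ctypes.create_string_buffer(b"calibration-cache")
cap = ctypes.pythonapi.PyCapsule_New(ctypes.cast(buf, ctypes.c_void_p),
                                     None, None)
print(capsule_to_bytes(cap, 17))
```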

Hi qjfytz,

You can see an example of a more up-to-date INT8 calibration class using TensorRT 6.0 here: https://devtalk.nvidia.com/default/topic/1065026/tensorrt/tensorrt6-dynamic-input-size-does-not-support-int8-with-calibrator-/post/5393304/#5393304
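For anyone landing here later: in TensorRT 5.1+ the `cache` argument supports the buffer protocol, so no capsule or pointer handling is needed. A rough sketch of the shape of such a calibrator (the class is defined inside a factory function so the file parses without TensorRT/pycuda installed; the cache-file name and the batch-stream interface are assumptions):

```python
def make_calibrator(batch_stream, cache_file="calibration.cache"):
    """Return an IInt8EntropyCalibrator2 fed by `batch_stream`.

    `batch_stream` must expose batch_size and next_batch(), which returns
    a contiguous float32 numpy array, or None when exhausted.
    """
    import os
    import tensorrt as trt
    import pycuda.driver as cuda
    import pycuda.autoinit  # noqa: F401  (creates a CUDA context)

    class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
        def __init__(self):
            trt.IInt8EntropyCalibrator2.__init__(self)
            self.stream = batch_stream
            self.d_input = None  # device buffer, sized on first batch

        def get_batch_size(self):
            return self.stream.batch_size

        def get_batch(self, names):
            batch = self.stream.next_batch()
            if batch is None:
                return None  # signals end of calibration data
            if self.d_input is None:
                self.d_input = cuda.mem_alloc(batch.nbytes)
            cuda.memcpy_htod(self.d_input, batch)
            return [int(self.d_input)]

        def read_calibration_cache(self):
            if os.path.exists(cache_file):
                with open(cache_file, "rb") as f:
                    return f.read()
            return None  # no cache yet: run calibration

        def write_calibration_cache(self, cache):
            # `cache` is a buffer object here, so it can be written
            # directly, with no int(ptr) or capsule handling.
            with open(cache_file, "wb") as f:
                f.write(cache)

    return EntropyCalibrator()
```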