I’m using the TensorRT C++ API to run inference on an FP32 CNN model (YOLOv3) that was developed and trained using the TensorFlow and TensorRT Python APIs.
I wanted to convert it to an INT8 model using the TensorRT calibrator.
I have a set of ~9000 representative images that I’m using for the quantization process.
I divided them into batches of 60 images each.
Everything works, but I noticed that while buildCudaEngine runs the calibration over all of the previously created batches, CPU RAM consumption keeps growing until all of it is occupied, which crashes the entire PC.
I have 24 GB of RAM, but because of this behavior I cannot use more than ~9000 images for calibration, which limits the accuracy of my calibration table.
Is there any way to control this issue?
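One approach I’m considering is streaming calibration batches from disk into a single reusable host buffer, rather than preloading all batches into RAM before calling buildCudaEngine. Below is a minimal, self-contained sketch of such a batch streamer in plain C++ (no TensorRT dependency shown); the file-naming scheme (`batch0.bin`, `batch1.bin`, …), the `BatchStream` class name, and the raw-float batch layout are all assumptions for illustration, not my actual code. In a real calibrator, `IInt8EntropyCalibrator::getBatch` would copy from this buffer to the device.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: loads one calibration batch at a time from disk,
// reusing a single host buffer so RAM usage stays roughly constant
// regardless of how many batches/images are used.
class BatchStream {
public:
    BatchStream(std::string prefix, int numBatches, size_t batchBytes)
        : mPrefix(std::move(prefix)),
          mNumBatches(numBatches),
          mBuffer(batchBytes) {}  // one buffer sized for exactly one batch

    // Read the next batch file into the shared buffer.
    // Returns false once all batches have been consumed (or on I/O error).
    bool next() {
        if (mCurrent >= mNumBatches) return false;
        std::string path = mPrefix + std::to_string(mCurrent++) + ".bin";
        std::FILE* f = std::fopen(path.c_str(), "rb");
        if (!f) return false;
        size_t got = std::fread(mBuffer.data(), 1, mBuffer.size(), f);
        std::fclose(f);
        return got == mBuffer.size();
    }

    // Pointer to the current batch data (assumed to be raw floats).
    const float* data() const {
        return reinterpret_cast<const float*>(mBuffer.data());
    }

private:
    std::string mPrefix;
    int mNumBatches = 0;
    int mCurrent = 0;
    std::vector<char> mBuffer;  // reused for every batch
};
```

With this pattern, the calibrator’s getBatch callback would call `next()` and hand `data()` to cudaMemcpy, so only one 60-image batch is resident on the host at a time.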
Linux distro and version - Linux-x86_64, Ubuntu 16.04
GPU type - GeForce GTX 1080
nvidia driver version - 396.26
CUDA version - Release 9.0, V9.0.252
CUDNN version - 7.1.4
Python version – 3.5.2
Tensorflow version – 1.8
TensorRT version – 18.104.22.168