I’m attempting to run INT8 inference on a deep network with TensorRT in C++ (the system runs fine in FP16 mode). The output of the network is an H x W heat map.
After calibrating and building the INT8 engine, the confidence score at the peaks of a test image is much lower than with the FP16 engine (0.2 instead of 0.8). I have tried multiple calibration sets to make sure it’s not a problem with the calibration data:
- 5K images similar to my test image.
- 50K images comprising the network’s training set.
- The test image alone as the calibration set.
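For reference, the calibration in all three cases follows the standard entropy-calibrator pattern. This is a condensed sketch of my setup, not the full code: `HeatmapCalibrator`, `loadNextBatch()`, and the cache file name are placeholders for my application-specific pieces; only the `nvinfer1` interfaces are the real TensorRT API.

```cpp
#include <NvInfer.h>
#include <cuda_runtime_api.h>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Sketch of the INT8 calibrator (names like HeatmapCalibrator and
// loadNextBatch are placeholders for the application's own code).
class HeatmapCalibrator : public nvinfer1::IInt8EntropyCalibrator2 {
public:
    HeatmapCalibrator(int32_t batchSize, size_t inputBytes, std::string cacheFile)
        : mBatchSize(batchSize), mCacheFile(std::move(cacheFile)) {
        cudaMalloc(&mDeviceInput, inputBytes);
    }
    ~HeatmapCalibrator() override { cudaFree(mDeviceInput); }

    int32_t getBatchSize() const noexcept override { return mBatchSize; }

    bool getBatch(void* bindings[], const char* names[], int32_t nbBindings) noexcept override {
        // loadNextBatch() fills a host buffer with the next preprocessed
        // batch and returns false once the calibration set is exhausted.
        std::vector<float> host;
        if (!loadNextBatch(host)) return false;
        cudaMemcpy(mDeviceInput, host.data(), host.size() * sizeof(float),
                   cudaMemcpyHostToDevice);
        bindings[0] = mDeviceInput;
        return true;
    }

    const void* readCalibrationCache(size_t& length) noexcept override {
        mCache.clear();
        std::ifstream in(mCacheFile, std::ios::binary);
        if (in) mCache.assign(std::istreambuf_iterator<char>(in), {});
        length = mCache.size();
        return mCache.empty() ? nullptr : mCache.data();
    }

    void writeCalibrationCache(const void* cache, size_t length) noexcept override {
        std::ofstream out(mCacheFile, std::ios::binary);
        out.write(static_cast<const char*>(cache), length);
    }

private:
    bool loadNextBatch(std::vector<float>& host);  // application-specific
    int32_t mBatchSize;
    void* mDeviceInput{nullptr};
    std::string mCacheFile;
    std::vector<char> mCache;
};

// During the build, INT8 is enabled on the builder config, with FP16
// also allowed so the builder may fall back where INT8 is unsupported:
//   config->setFlag(nvinfer1::BuilderFlag::kINT8);
//   config->setFlag(nvinfer1::BuilderFlag::kFP16);
//   config->setInt8Calibrator(&calibrator);
```

I delete the calibration cache file between runs, so each of the three calibration sets above produced a fresh calibration rather than reusing a stale cache.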
Additionally, I get the following warnings during the engine build:
[W] [TRT] Detected invalid timing cache, setup a local cache instead.
[W] [TRT] Cache result detected as invalid for node: XXXX_conv_YYYY, LayerImpl: CaskConvolution
TensorRT Version: 8
GPU Type: GeForce RTX 3080
Nvidia Driver Version: 11.4
CUDA Version: 11.4
CUDNN Version: 8.2
Operating System + Version: Ubuntu 18
Work environment: C++