Hi,
I used a data set to generate an INT8 calibration table, then used that table together with a Caffe model to generate a TensorRT cache file for face-recognition inference. As long as I reuse the same TensorRT cache file, the results are identical. But every time I delete the cache file and regenerate it from the same INT8 table and the same Caffe model, the results differ by about 0.1%. So it seems that each generated TensorRT cache file may be slightly different. Is this a bug or expected behavior?
Thanks.
Hi,
Linux distro and version: Ubuntu 16.04
GPU type: GTX 1080 Ti
NVIDIA driver version: 430.14
CUDA version: 10.1
cuDNN version: 7.5.0
TensorRT version: 5.1.2.2
I also found something strange in INT8 mode on TensorRT 4 and TensorRT 3.
Run sampleINT8 with a small number of samples:
./sample_int8 mnist batch=3 start=100 score=1
and print the probability that the model assigns to the correct label:
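int calculateScore(float* batchProb, float* labels, int batchSize, int outputSize, int threshold)
{
    int success = 0;
    for (int i = 0; i < batchSize; i++)
    {
        // Scores for image i, and the score assigned to its true label.
        float* prob = batchProb + outputSize * i;
        float correct = prob[(int) labels[i]];
        // Added for debugging: print the probability of the correct label.
        std::cout << "correct prob:" << correct << std::endl;
        // Count the classes scoring at least as high as the correct one (including itself).
        int better = 0;
        for (int j = 0; j < outputSize; j++)
            if (prob[j] >= correct)
                better++;
        // Count a hit when at most `threshold` classes score that high.
        if (better <= threshold)
            success++;
    }
    return success;
}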
You will see different probabilities for the same image across runs. In the two runs below, the FP32 values are identical, but the INT8 values differ slightly:
FP32 run:1 batches of size 3 starting at 100
correct prob:0.999968
correct prob:0.999975
correct prob:0.999854
correct prob:0.999968
correct prob:0.999975
correct prob:0.999854
Top1: 1, Top5: 1
Processing 3 images averaged 0.0382293 ms/image and 0.114688 ms/batch.
FP16 run:1 batches of size 3 starting at 100
Engine could not be created at this precision
INT8 run:1 batches of size 3 starting at 100
correct prob:0.999952
correct prob:0.999965
correct prob:0.999665
correct prob:0.999952
correct prob:0.999965
correct prob:0.999665
=================================================
FP32 run:1 batches of size 3 starting at 100
correct prob:0.999968
correct prob:0.999975
correct prob:0.999854
correct prob:0.999968
correct prob:0.999975
correct prob:0.999854
Top1: 1, Top5: 1
Processing 3 images averaged 0.0404373 ms/image and 0.121312 ms/batch.
FP16 run:1 batches of size 3 starting at 100
Engine could not be created at this precision
INT8 run:1 batches of size 3 starting at 100
correct prob:0.999953
correct prob:0.999967
correct prob:0.99967
correct prob:0.999953
correct prob:0.999967
correct prob:0.99967
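To put a number on the run-to-run drift in the INT8 output above, one option is to save the outputs from two engine builds and compare them element by element. The following is only a minimal sketch: maxAbsDiff and withinTolerance are hypothetical helpers, not part of sampleINT8 or the TensorRT API, and the tolerance is something you would have to choose for your own model.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of sampleINT8 or TensorRT):
// largest absolute element-wise difference between two output vectors.
float maxAbsDiff(const std::vector<float>& a, const std::vector<float>& b)
{
    // Both vectors must hold outputs for the same inputs, in the same order.
    assert(a.size() == b.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < a.size(); i++)
        worst = std::max(worst, std::fabs(a[i] - b[i]));
    return worst;
}

// Hypothetical helper: true when two builds agree to within `tolerance`.
// The tolerance is an assumption the caller must pick for the model at hand.
bool withinTolerance(const std::vector<float>& a, const std::vector<float>& b, float tolerance)
{
    return maxAbsDiff(a, b) <= tolerance;
}
```

For example, passing the three INT8 "correct prob" values from the first run (0.999952, 0.999965, 0.999665) and from the second run (0.999953, 0.999967, 0.99967) gives a maximum difference of roughly 5e-6, which comes from the third image.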