Description
The per-layer setPrecision API does not seem to take effect.
Environment
TensorRT Version: 7.0.0.11
GPU Type: RTX 2080 Ti
Nvidia Driver Version: 440.95.01
CUDA Version: 10.2
CUDNN Version: 7.6.5
Operating System + Version: CentOS 7
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.6
Baremetal or Container (if container which image + tag):
Relevant Files
I am currently using TensorRT to accelerate a yolov5 model and to quantize it to INT8. At present the accuracy loss in INT8 mode is fairly large, so I want to screen the layers one by one and fall the problematic ones back to higher precision. From the official documentation I found that the setPrecision API can set the computation precision of an individual layer.
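Roughly, this is what I took away from the docs (the helper function and its name below are just my own illustration; ILayer::setPrecision and ILayer::setOutputType are the actual TensorRT 7 methods I am referring to):

#include "NvInfer.h"

using namespace nvinfer1;

// My reading of the docs: request that a single layer run in FP32 while the
// rest of the network is still built with INT8 enabled.
// (forceLayerToFp32 is my own illustrative helper, not a TensorRT API.)
void forceLayerToFp32(ILayer* layer) {
    layer->setPrecision(DataType::kFLOAT);     // computation precision of the layer
    layer->setOutputType(0, DataType::kFLOAT); // type of the layer's first output
}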
So I changed the code as follows:
" ILayer* focus(INetworkDefinition *network, std::map<std::string, Weights>& weightMap, ITensor& input, int inch, int outch, int ksize, std::string lname) {
ISliceLayer *s1 = network->addSlice(input, Dims3{0, 0, 0}, Dims3{inch, Yolo::INPUT_H / 2, Yolo::INPUT_W / 2}, Dims3{1, 2, 2});
s1->setPrecision(DataType::kFLOAT); // Set the fp32 computational precision of this layer.
ISliceLayer *s2 = network->addSlice(input, Dims3{0, 1, 0}, Dims3{inch, Yolo::INPUT_H / 2, Yolo::INPUT_W / 2}, Dims3{1, 2, 2});
s2->setPrecision(DataType::kFLOAT); // Set the fp32 computational precision of this layer.
ISliceLayer *s3 = network->addSlice(input, Dims3{0, 0, 1}, Dims3{inch, Yolo::INPUT_H / 2, Yolo::INPUT_W / 2}, Dims3{1, 2, 2});
s3->setPrecision(DataType::kFLOAT); // Set the fp32 computational precision of this layer.
ISliceLayer *s4 = network->addSlice(input, Dims3{0, 1, 1}, Dims3{inch, Yolo::INPUT_H / 2, Yolo::INPUT_W / 2}, Dims3{1, 2, 2});
s4->setPrecision(DataType::kFLOAT); // Set the fp32 computational precision of this layer.
ITensor* inputTensors[] = {s1->getOutput(0), s2->getOutput(0), s3->getOutput(0), s4->getOutput(0)};
auto cat = network->addConcatenation(inputTensors, 4);
cat->setPrecision(DataType::kFLOAT); // Set the fp32 computational precision of this layer.
auto conv = convBnLeaky(network, weightMap, *cat->getOutput(0), outch, ksize, 1, 1, lname + ".conv");
conv->setPrecision(DataType::kFLOAT); // Set the fp32 computational precision of this layer.
return conv;
}"
But after making this change and testing, I found that even with all of these layers restricted to FP32, the output accuracy was exactly the same as without any restriction.
So I would like to confirm: am I using this API incorrectly?
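One thing I am not sure about is whether the per-layer setting is only treated as a preference unless strict types are also requested on the builder config. A minimal sketch of what I mean (enableStrictTypes is my own illustrative helper; BuilderFlag::kINT8 and BuilderFlag::kSTRICT_TYPES are real TensorRT 7 builder flags, but whether the strict-types flag is actually required here is exactly what I am asking):

#include "NvInfer.h"

using namespace nvinfer1;

// Flags I suspect may be needed on top of my existing INT8 build configuration.
void enableStrictTypes(IBuilderConfig& config) {
    config.setFlag(BuilderFlag::kINT8);         // keep INT8 enabled globally
    config.setFlag(BuilderFlag::kSTRICT_TYPES); // ask TensorRT to honor per-layer precision requests
}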
thanks!!!