Hello, I am studying TensorRT.
I want to build custom layers (using IPluginV2IOExt and a CUDA kernel) and apply INT8 quantization to the model.
As a first step, I implemented a simple convolution layer with a CUDA kernel, matching the behavior of the built-in convolution.
However, when I added an IInt8EntropyCalibrator2 and ran INT8 quantization, I realized there seems to be no way to give the custom layer quantized weights; the plugin only receives input and output data.
To be brief,
- build custom layers with a CUDA kernel and IPluginV2IOExt
- run INT8 quantization with IInt8EntropyCalibrator2
However, I don't see a way to pass weight data to an IPluginV2IOExt, so I cannot quantize my custom layers.
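To show what I mean by "quantized weights": as far as I understand, the calibrator only derives scales for activations, so I expected to have to quantize the plugin's own weights myself with something like symmetric per-tensor INT8. A minimal sketch of that step (the struct and function names here are just my own, not TensorRT API):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Sketch only: symmetric per-tensor INT8 quantization of a weight blob.
// IInt8EntropyCalibrator2 produces scales for activations, so a custom
// plugin would have to carry a weight scale like this itself.
struct QuantizedWeights {
    std::vector<int8_t> values;  // quantized weights
    float scale;                 // dequantize: w[i] ~= values[i] * scale
};

QuantizedWeights quantizeWeights(const std::vector<float>& w) {
    // Per-tensor scale from the largest absolute weight.
    float maxAbs = 0.0f;
    for (float v : w) maxAbs = std::max(maxAbs, std::fabs(v));

    QuantizedWeights q;
    q.scale = (maxAbs > 0.0f) ? maxAbs / 127.0f : 1.0f;
    q.values.reserve(w.size());
    for (float v : w) {
        // Round to nearest, then clamp to the symmetric INT8 range.
        long r = std::lround(v / q.scale);
        r = std::min(127L, std::max(-127L, r));
        q.values.push_back(static_cast<int8_t>(r));
    }
    return q;
}
```

My question is where this quantized data and its scale are supposed to enter the plugin, since the calibration flow never hands weights to it.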
Is it impossible?
Please help.
Thank you.