Hello, everyone. I am trying to compress my network to 8-bit, but I found that some customized layers didn't work, so I may need to fine-tune the network in 8-bit. However, I cannot find any documentation about the calibration table. Can anyone tell me how to interpret the cache table, or where I can find documentation on the details of 8-bit calibration? Thanks!
INT8 can only run on platforms with GPU architecture 6.1.
TX2 is an sm_62 design and doesn't support the INT8 feature.
Thanks for your reply. The platform is a P4, which does support 8-bit. Since I could not find a section in this forum for the P4/P40, I posted my problem here.
Sorry for the late reply.
The key INT8 concepts can be found in our user guide:
3.7. SampleINT8 - Calibration and 8-bit Inference
Please refer to it for more information.
In 3.7. SampleINT8 - Calibration and 8-bit Inference, it says: "The parameters are recorded in the table. If the network or calibration set changes, it is the application’s responsibility to invalidate the cache."
But I can't find any details about these parameters. The cache file saved by TensorRT is stored in a binary format:
(Unnamed ITensor* 38): 3c3604b4
(Unnamed ITensor* 88): 3c42b64c
(Unnamed ITensor* 55): 3cb2d9e0
(Unnamed ITensor* 45): 3c22703f
(Unnamed ITensor* 9): 3c4023ca
(Unnamed ITensor* 30): 3c5e5988
(Unnamed ITensor* 123): 3c3b5bb9
(Unnamed ITensor* 41): 3ccc51b9
(Unnamed ITensor* 6): 3c6ac62b
(Unnamed ITensor* 31): 3c663928
(Unnamed ITensor* 26): 3ce3acfb
Because of some custom layers, I can't use the cache table directly.
If I want to fine-tune the network, I need to know how to use these parameters in my training code.
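One common way to use such per-tensor ranges during fine-tuning is "fake quantization": clip each activation to its calibrated range, quantize to the INT8 grid, and immediately dequantize, so the forward pass sees INT8-like values while training stays in floating point. A minimal NumPy sketch, with the caveat that the function name and the `scale = range / 127` convention are my assumptions, not something from the TensorRT sample:

```python
import numpy as np

def fake_quantize(x, dynamic_range, num_bits=8):
    # Simulate symmetric INT8 quantization during training:
    # clip to the calibrated range, snap to the quantization grid, dequantize.
    qmax = 2 ** (num_bits - 1) - 1          # 127 for INT8
    scale = dynamic_range / qmax            # assumed scale convention
    x = np.clip(x, -dynamic_range, dynamic_range)
    return np.round(x / scale) * scale

x = np.array([-2.0, -0.5, 0.0, 0.3, 1.5])
print(fake_quantize(x, dynamic_range=1.0))
```

Values outside the calibrated range are clamped (here -2.0 and 1.5 both land on the range boundary), which mimics the saturation the INT8 engine applies at inference time.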
The calibration table is dumped directly from a memory buffer.
We have a native sample that demonstrates the INT8 feature:
... Int8EntropyCalibrator calibrator(calibrationStream, FIRST_CAL_BATCH); ...
Please check it for more information.
Can you please explain what these hex numbers in the calibration table mean?
Each hex number reflects the range that the INT8 calibration process uses for that tensor. For example, if the value decodes to 114, then that tensor will be clamped to [-114, 114].
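If it helps, the hex strings in the cache appear to be the raw bit patterns of IEEE 754 float32 values, so a line can be decoded with a few lines of Python. This is a sketch under that assumption (the helper name is mine, and whether the decoded number is the full range or a per-tensor scale that still gets multiplied by 127 is not documented here):

```python
import struct

def parse_calib_line(line):
    # A cache line looks like "(Unnamed ITensor* 38): 3c3604b4";
    # split on the last colon to separate tensor name and hex value.
    name, hexval = line.rsplit(":", 1)
    # Reinterpret the 8 hex digits as a big-endian IEEE 754 float32.
    value = struct.unpack(">f", bytes.fromhex(hexval.strip()))[0]
    return name.strip(), value

name, value = parse_calib_line("(Unnamed ITensor* 38): 3c3604b4")
print(name, value)  # decodes to roughly 0.0111
```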
I think you can use the calibration table directly for your model with customized layers. TensorRT only takes the layers present in the cache into consideration and leaves your customized layers alone.
I also tried doing INT8 calibration directly on my model (which includes a bunch of customized layers), but I got the following error:
But I succeeded with the calibration process by providing the program a fake cache table, generated from a subset of the whole model (say, the backbone).
My computer has 64 GB of memory, so I am confused by the error log.
Can anyone help? Thanks.
Please open a new topic for your issue. Thanks.