Today, the INT8 Calibration Cache is a text file that looks like this:
conv2d1: deadbeef
...
So the format appears to be <tensor_name>: <scale factor>, where the scale factor is (presumably) the hexadecimal representation of a 32-bit floating-point number in IEEE 754 big-endian format.
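To make the presumed format concrete, here is a minimal Python sketch of what such manual parsing could look like. It relies entirely on the reverse-engineered assumption above (8 hex digits encoding a big-endian IEEE 754 float32); this is not a documented TensorRT format, and `decode_scale` and `parse_calibration_cache` are hypothetical helper names, not TensorRT APIs.

```python
import struct
from typing import Dict


def decode_scale(hex_word: str) -> float:
    """Decode 8 hex digits as a big-endian IEEE 754 float32 (assumed encoding)."""
    raw = bytes.fromhex(hex_word.strip())
    if len(raw) != 4:
        raise ValueError(f"expected 8 hex digits, got {hex_word!r}")
    return struct.unpack(">f", raw)[0]


def parse_calibration_cache(text: str) -> Dict[str, float]:
    """Map tensor names to scale factors, assuming '<tensor_name>: <hex>' lines."""
    scales: Dict[str, float] = {}
    for line in text.splitlines():
        # Split on the LAST colon, since tensor names may themselves contain ':'.
        name, sep, value = line.strip().rpartition(":")
        if not sep or not name:
            continue  # not a '<name>: <value>' line, so skip it
        try:
            scales[name] = decode_scale(value)
        except ValueError:
            continue  # value is not 8 hex digits; ignore the line
    return scales
```

With this sketch, looking up one tensor is just `parse_calibration_cache(text)["conv2d1"]`. Lines that do not match the assumed pattern are skipped silently, which is a deliberate choice here because the rest of the file layout is unknown.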
Questions:
Let’s say we take the calibration cache and parse it manually, without using any TensorRT API. Can we assume that the format described above is stable and dependable? Or can it change in a new TRT release without notice (since it’s undocumented)?
If the format is stable, could it be documented in the official TensorRT documentation, so that we can depend on solid information instead of reverse engineering?
We recommend always using the TensorRT API.
The format of the calibration cache used by TensorRT is not officially documented, and parsing it manually instead of going through the provided TensorRT APIs is not supported. As a result, relying on the format may lead to compatibility issues and unexpected behavior.
TensorRT releases may introduce changes, improvements, or optimizations that could impact the calibration cache format.
I understand that, but TensorRT does not provide any API for extracting information from the calibration cache (other than reading it into an opaque data blob) - what should be used then?
Let’s say I want to extract the scale factor of a given tensor name from the calibration cache - what API can I use to get that information?