INT8 Calibration Cache format - could it be officially documented?

Hi,

Today, the INT8 Calibration Cache is a text file that looks like this:

conv2d1: deadbeef
...

So the format is <tensor_name>: <scale factor>, where the scale factor is (presumably) the hexadecimal representation of an IEEE 754 floating-point number in big-endian byte order.
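To make the assumed layout concrete, here is a minimal Python sketch of how such a file could be parsed without any TensorRT API. It rests entirely on the guess above: each entry is a tensor name, a colon, and 8 hex digits encoding a big-endian IEEE 754 float32. None of this is confirmed by NVIDIA, and the function name parse_calibration_cache is hypothetical. Lines that do not match the assumed layout are skipped rather than interpreted.

```python
import struct


def parse_calibration_cache(text):
    """Parse '<tensor_name>: <hex>' lines into a {tensor_name: float} dict.

    ASSUMPTION (undocumented, reverse engineered): the hex string is a
    32-bit IEEE 754 float stored in big-endian byte order.
    """
    scales = {}
    for line in text.splitlines():
        # Split at the LAST colon so tensor names containing ':' survive.
        name, sep, hex_value = line.rpartition(":")
        name = name.strip()
        hex_value = hex_value.strip()
        if not sep or not name or len(hex_value) != 8:
            continue  # not an entry in the assumed layout; skip it
        try:
            raw = bytes.fromhex(hex_value)
        except ValueError:
            continue  # not valid hexadecimal
        if len(raw) != 4:
            continue  # must be exactly 4 bytes for a float32
        (scale,) = struct.unpack(">f", raw)  # '>f' = big-endian float32
        scales[name] = scale
    return scales
```

With this, looking up one tensor is just scales.get("conv2d1") on the returned dict. Because the format is undocumented, any code like this should be treated as fragile and re-checked against every new TensorRT release.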

Questions:

  • Let’s say that we take the Calibration Cache and manually parse it without using any TensorRT API. Can we assume that the format that I mentioned above is stable and dependable? Or can the format change on a new TRT release without notice (since it’s undocumented)?

  • If we can assume the format is stable - could it be documented in the official TensorRT documentation? So that we can depend on solid information instead of reverse engineering.

Thanks!


Hi, please refer to the links below to perform inference in INT8.

Thanks!

Those resources do not answer my question.

Hi,

We always recommend using the TensorRT API.
The format of the calibration cache is not officially documented, and parsing it manually instead of going through the provided TensorRT APIs is not supported. Relying on the format may therefore lead to compatibility issues and unexpected behavior.

TensorRT releases may introduce changes, improvements, or optimizations that could impact the calibration cache format.

Thank you.

Hi,

I understand that, but TensorRT does not provide any API for extracting information from the calibration cache (other than reading it into an opaque data blob) - what should be used then?

Let’s say I want to extract the scale factor of a given tensor name from the calibration cache - what API can I use to get that information?

Thanks!


Hi,

At the moment, no such APIs are available. We will provide them in a future release.

Thank you.