API for getting INT8 calibration scale factors after calibration is finished?


Is there an API for fetching the INT8 calibration scale factor of a given tensor after the INT8 calibration process has finished?

Currently this information is encoded in the INT8 calibration cache, but the existing API only gives a raw pointer to a buffer. According to the docs, the calibration cache is an “internal implementation detail”, so I take it I should not “reverse engineer” it to obtain these scale factors, since the format can change at “any time” at NVIDIA’s discretion.

My use case is:

  • Build DLA engine + Safety + INT8.
  • Due to Safety, the network must be reformat-free.
  • Therefore, the I/O tensors are INT8.
  • I want to end up with FP32 tensors → I need to create the INT8 → FP32 reformatting layer myself.
  • I can only accomplish that if I know the correct scale factors to apply to convert from INT8 to FP32. This information is obtained somewhere in the INT8 calibration process - how do I get it?
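
To make the last step concrete, here is a minimal sketch of the conversion I have in mind. It assumes symmetric per-tensor quantization, i.e. real_value = int8_value * scale with scale = dynamic_range / 127. The function names are hypothetical and not part of any TensorRT API; the scale value itself is exactly what I am missing.

```python
def scale_from_dynamic_range(abs_max):
    """Assumed relation: a symmetric dynamic range [-abs_max, abs_max] maps onto [-127, 127]."""
    return abs_max / 127.0


def dequantize_int8(values, scale):
    """Convert INT8 values to FP32 using a single per-tensor scale (assumption)."""
    out = []
    for v in values:
        if not -128 <= v <= 127:
            raise ValueError(f"value {v} is outside the INT8 range")
        out.append(float(v) * scale)
    return out


# Example with a made-up scale of 0.5:
print(dequantize_int8([0, 127, -127], 0.5))
```

With the correct scale for each output tensor, this is all the reformatting layer would have to do.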



We don’t think there is a scale for the output if it is INT8. If this scale does exist, the only way we can think of to get it is to read the calibration cache file.
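
For illustration only, reading the cache could look roughly like the sketch below. It assumes the text layout commonly seen in calibration caches: one header line, followed by lines of the form `tensor_name: <8 hex digits>`, where the hex digits are the big-endian IEEE-754 float32 encoding of the scale. That layout is an assumption, it is not documented, and it may change between releases. The function name and the sample contents are hypothetical.

```python
import struct


def parse_calibration_cache(text):
    """Return {tensor_name: scale} from cache text, under the assumed layout."""
    scales = {}
    lines = text.splitlines()
    for line in lines[1:]:  # skip the header line
        # Split on the last colon, in case a tensor name contains colons.
        name, sep, hex_value = line.rpartition(":")
        hex_value = hex_value.strip()
        if not sep or len(hex_value) != 8:
            continue  # not a "name: hex" entry
        try:
            raw = bytes.fromhex(hex_value)
        except ValueError:
            continue  # not valid hex
        if len(raw) != 4:
            continue
        # Assumed encoding: big-endian float32.
        scales[name.strip()] = struct.unpack(">f", raw)[0]
    return scales


# Made-up cache contents, for demonstration only.
sample = "TRT-8001-EntropyCalibration2\ninput: 3c000000\noutput: 3f800000\n"
print(parse_calibration_cache(sample))
```

Any value recovered this way should be checked against known inputs and outputs before it is relied on.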

Thank you.