No clear indication of what the format of the calibration data should be for the trtexec application

Description

I have my model in ONNX format and am trying to create a .engine file on the Jetson Xavier NX platform. I have verified that the model works fine in a desktop environment using onnxruntime. However, when I convert the model to .engine using trtexec with the following command, I get bad results:

./trtexec --onnx=resnetUnknown.onnx --int8 --saveEngine=resnetUnknown_batch5.engine --verbose
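For reference, the onnxruntime check on the desktop was along these lines (a minimal sketch; the 5x1x224x224 input shape is an assumption based on my calibration data layout described below):

import numpy as np
import onnxruntime as ort

# Load the ONNX model and run a forward pass on a dummy batch.
session = ort.InferenceSession("resnetUnknown.onnx")
input_name = session.get_inputs()[0].name
dummy = np.random.rand(5, 1, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)

On the desktop this gives sensible predictions; the same model converted through trtexec does not.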

My best guess is that I need to supply a calibration file, but I can't find useful information on how the data in that file should be formatted. I am guessing it is a binary file with a sequence of floats, so if my data is 5x1x224x224, I would have 250880 values per sample, where I just sequentially append each float value in binary format. I do this with a Python script, sketched below.
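This is roughly the script (a sketch of what I described; the input path is a placeholder):

import numpy as np

# Placeholder input: N samples, each 5x1x224x224 float32 (250880 values).
samples = np.load("calibration_samples.npy")

# Append every sample's raw float32 bytes back to back, with no header.
with open("calibration_data.bin", "wb") as f:
    for sample in samples:
        f.write(sample.astype(np.float32).tobytes())

With the file written this way, I then try to build a .engine file with the following command: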

./trtexec --onnx=resnetUnknown.onnx --int8 --saveEngine=resnetUnknown_batch5.engine --verbose --calib=calibration_data.bin

and I get the following error:

[08/11/2023-16:02:41] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +0, now: CPU 939, GPU 3127 (MiB)
[08/11/2023-16:02:41] [V] [TRT] Total per-runner device persistent memory is 0
[08/11/2023-16:02:41] [V] [TRT] Total per-runner host persistent memory is 1728
[08/11/2023-16:02:41] [V] [TRT] Allocated activation device memory of size 36126720
[08/11/2023-16:02:41] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +34, now: CPU 0, GPU 38 (MiB)
[08/11/2023-16:02:41] [V] [TRT] Calculating Maxima
[08/11/2023-16:02:41] [I] [TRT] Starting Calibration.
[08/11/2023-16:02:41] [E] Error[2]: [calibrator.cu::absTensorMax::141] Error Code 2: Internal Error (Assertion memory != nullptr failed. memory must be valid if nbElem != 0)
[08/11/2023-16:02:41] [V] [TRT] Trying to load shared library libcudnn.so.8
[08/11/2023-16:02:41] [V] [TRT] Loaded shared library libcudnn.so.8
[08/11/2023-16:02:41] [V] [TRT] Trying to load shared library libcudnn.so.8
[08/11/2023-16:02:41] [V] [TRT] Loaded shared library libcudnn.so.8
[08/11/2023-16:02:41] [E] Error[1]: [convolutionRunner.cpp::executeConv::462] Error Code 1: Cudnn (CUDNN_STATUS_BAD_PARAM)
[08/11/2023-16:02:41] [E] Error[3]: [engine.cpp::~Engine::306] Error Code 3: API Usage Error (Parameter check failed at: runtime/api/engine.cpp::~Engine::306, condition: mObjectCounter.use_count() == 1. Destroying an engine object before destroying objects it created leads to undefined behavior.
)
[08/11/2023-16:02:41] [E] Error[2]: [calibrator.cpp::calibrateEngine::1181] Error Code 2: Internal Error (Assertion context->executeV2(&bindings[0]) failed. )
[08/11/2023-16:02:41] [E] Error[2]: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[08/11/2023-16:02:41] [E] Engine could not be created from network
[08/11/2023-16:02:41] [E] Building engine failed
[08/11/2023-16:02:41] [E] Failed to create engine from model or file.
[08/11/2023-16:02:41] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8502]

Now, if I try to construct a .engine file with the following command (without the --int8 flag):

./trtexec --onnx=resnetUnknown.onnx --saveEngine=resnetUnknown_batch5.engine --verbose --calib=calibration_data.bin

Then the optimization runs through, but it seems to me that the --calib file is not being used in the process. I test the resulting .engine file and it gives me bad predictions (just like in the first case).
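Reading the TensorRT documentation further, my guess is that --calib expects a calibration cache produced by a calibrator object rather than raw tensor data. A minimal sketch of such a calibrator through the TensorRT Python API, subclassing trt.IInt8EntropyCalibrator2 (the data loading and file names here are my assumptions):

import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

class BinCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, data, cache_file, batch_size=5):
        super().__init__()
        self.data = data              # float32 array, shape (N, 1, 224, 224)
        self.cache_file = cache_file
        self.batch_size = batch_size
        self.index = 0
        self.device_input = cuda.mem_alloc(data[:batch_size].nbytes)

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        if self.index + self.batch_size > len(self.data):
            return None               # no more batches: calibration is done
        batch = np.ascontiguousarray(self.data[self.index:self.index + self.batch_size])
        cuda.memcpy_htod(self.device_input, batch)
        self.index += self.batch_size
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

The idea would be to run calibration once with this class so it writes a cache file, then pass that file to trtexec via --calib. I have not been able to confirm that this is the intended workflow, hence this post.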

Environment

TensorRT Version: TensorRT 8.5.2

GPU Type: Nvidia Xavier NX JetPack 5.1.1

CUDA Version: CUDA 11.4.19

CUDNN Version: cuDNN 8.6.0

Relevant Files

How can I share files privately? I don’t want to do it publicly.

Hi,

Please refer to the following documents and samples and make sure you are following the correct steps.

Thank you.

Hi. This is unhelpful. I'm trying to go down the torch2trt path and I'm already getting errors installing the torch2trt library on the Jetson. Can you please direct me to an expert in model conversion on the edge? I will pay for their time if necessary.
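For reference, the torch2trt usage I was aiming for is roughly the following (a sketch based on the torch2trt README; the model file and the int8 flag are my assumptions):

import torch
from torch2trt import torch2trt

# Placeholder PyTorch model plus a dummy input matching the 5x1x224x224 data.
model = torch.load("resnetUnknown.pth").eval().cuda()
x = torch.randn(5, 1, 224, 224).cuda()

# torch2trt converts the model; per its README it exposes an int8_mode flag.
model_trt = torch2trt(model, [x], int8_mode=True)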

Thanks

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks

Dear @luis.valle,
Just to confirm: is the model working well with FP32, and do you want to get an INT8 TRT model?