TRT engine creation succeeds on Tesla T4 and Quadro P1000 but fails on Jetson AGX Xavier

Description

I have an ONNX model exported successfully from the source algorithm. There were no errors during ONNX creation, and I verified the model using the ONNX checker.

Using this ONNX model I am able to create a TRT engine and run inference successfully on Tesla T4 and Quadro P1000, but when I try to create the TRT engine on the Jetson AGX Xavier board it fails with the error:

[TensorRT] ERROR: …/builder/cudnnBuilderBlockChooser.cpp (117) - Assertion Error in buildMemGraph: 0 (mg.nodes[mg.regionIndices[outputRegion]].size == mg.nodes[mg.regionIndices[inputRegion]].size)

Please note

  • The ONNX model takes 2 inputs with dynamic shapes, and optimization profiles are defined for both inputs (a sketch of such a setup follows this list)
  • When both inputs have fixed dimensions, the TRT engine is created without any errors, even on Jetson AGX Xavier
  • A TRT engine with the 2 dynamic-shape inputs is created successfully on Tesla T4 and Quadro P1000 but fails on Jetson AGX Xavier
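
For reference, the profile setup for two dynamic inputs follows the usual TensorRT Python API pattern sketched below (a minimal sketch only; the input names and shapes are placeholders, not the actual model's):

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE)

    def build_engine(onnx_path):
        builder = trt.Builder(TRT_LOGGER)
        network = builder.create_network(
            1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        parser = trt.OnnxParser(network, TRT_LOGGER)
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None

        config = builder.create_builder_config()
        config.max_workspace_size = 1 << 30  # 1 GiB

        # One profile covering both dynamic inputs; names/shapes are placeholders.
        profile = builder.create_optimization_profile()
        profile.set_shape("input_1", (1, 3, 224, 224), (1, 3, 512, 512), (1, 3, 1024, 1024))
        profile.set_shape("input_2", (1, 3, 224, 224), (1, 3, 512, 512), (1, 3, 1024, 1024))
        config.add_optimization_profile(profile)

        return builder.build_engine(network, config)  # TRT 7.x / 8.0 API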

Environment

Jetson AGX Xavier with Jetpack 4.5.1 installed

Hi,
This looks like a Jetson issue. We recommend you raise it on the respective platform board via the link below.

Thanks!

Hi @NVES ,

Thank you for the reply, I changed the category of the issue.

Hi,

Which TensorRT version do you use in the desktop environment?
Also, could you share the detailed log with --verbose enabled?

Thanks.

Hi @AastaLLL ,

On Tesla T4 and Quadro P1000 I used TensorRT version 7.2.3.4, and my Jetson Xavier board has TensorRT version 7.1.3.0 (installed as part of JetPack 4.5.1).

Please find the verbose logs generated on Jetson AGX Xavier here

Do you need the verbose logs generated on Tesla T4 and Quadro P1000?

Thank you!

Hi @AastaLLL , any update on this? I shared the verbose logs and TensorRT version details.

Hi,

This error often happens when the model is generated with static input dimensions but a dynamic batch is used with trtexec.
Usually, the root cause is that the reshape size differs from the actual input size.

Could you try to generate the ONNX file with dynamic_axes and try again? (A sketch of such an export follows.)
Or, since there is no issue in v7.2, you can wait for our announcement of the new release.
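
A PyTorch export with dynamic_axes looks roughly like this (a minimal sketch; the model, example inputs, and axis names are placeholders, not from the original post):

    import torch

    # "model", "input_1", and "input_2" are hypothetical placeholders.
    torch.onnx.export(
        model,
        (input_1, input_2),
        "model_dynamic.onnx",
        input_names=["input_1", "input_2"],
        output_names=["output"],
        # Mark the spatial axes of both inputs as dynamic.
        dynamic_axes={
            "input_1": {2: "height", 3: "width"},
            "input_2": {2: "height", 3: "width"},
        },
        opset_version=11,
    )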

Thanks.

Hi @AastaLLL ,

I generated the ONNX file with dynamic_axes as well and tried converting that ONNX to TRT, but the engine creation fails with the same error.

After reading some more issues in the developer forums, I also replaced the reshape function with the view function in my source code, but that does not resolve the error either.

Is there no workaround to make this ONNX model with dynamic_axes work with JetPack 4.5.1, which has TensorRT version 7.1.3.0?

Thank you.

Hi,

Could you share the model with us?
We need to reproduce it internally to check whether there is any workaround for JetPack 4.5.1.

Thanks.

Hi @AastaLLL ,

I uploaded and shared the ONNX model with dynamic_axes in a personal message; please check your inbox.

Thank you.

Hi,

Please update your model to a float-type input format per the comment below:

After that, you will hit a reshape issue like the one below:

[07/27/2021-16:22:53] [E] [TRT] Reshape_78: -1 wildcard has infinite number of solutions or no solution
[07/27/2021-16:22:53] [E] [TRT] Builder failed while analyzing shapes.

The error indicates that the reshape layer uses -1, which is supported by TensorRT currently.
Is it possible to replace the dimension with some pre-defined value? One way to do that is sketched below.
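
One possible way is to patch the Reshape node's shape constant with ONNX GraphSurgeon (a minimal sketch; the file names and the replacement value 4 are assumptions, not taken from the actual model):

    import onnx
    import onnx_graphsurgeon as gs

    graph = gs.import_onnx(onnx.load("model_dynamic.onnx"))

    for node in graph.nodes:
        if node.op == "Reshape":
            shape = node.inputs[1]  # second input of Reshape is the target shape
            if isinstance(shape, gs.Constant) and -1 in shape.values:
                vals = shape.values.copy()
                vals[vals == -1] = 4  # replace the wildcard with a known dimension
                node.inputs[1] = gs.Constant(shape.name + "_fixed", vals)

    onnx.save(gs.export_onnx(graph), "model_fixed.onnx")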

Thanks.

Hi @AastaLLL ,

Thank you for suggesting the workaround.

You mentioned: “The error indicates that the reshape layer uses -1, which is supported by TensorRT currently.”

If -1 is supported by TensorRT, should I still replace -1 with a pre-defined value? Or did you mean something else?

Also, can you please let me know when the new JetPack version 4.6 will be released? I read that JetPack 4.6 would be released by the end of July 2021; it is now the last week of July and I cannot find release updates anywhere in the forums. Where can I find release-related updates?

Hi @AastaLLL ,

I reflashed my Jetson AGX Xavier with JetPack 4.6, which has TensorRT 8.0.16. I used the same ONNX model with dynamic_axes that I shared with you earlier to run inference on the board.

Now there is a new error:

[08/06/2021-12:38:47] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1372, GPU 9190 (MiB)
[08/06/2021-12:38:47] [E] Error[10]: [optimizer.cpp::computeCosts::1855] Error Code 10: Internal Error (Could not find any implementation for node PWN(qatm/sub_3, PWN(qatm/softmax_coef_ref/read:0 + (Unnamed Layer* 259) [Shuffle], qatm/mul_3)).)
[08/06/2021-12:38:47] [E] Error[2]: [builder.cpp::buildSerializedNetwork::417] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed.)

Please find the complete verbose logs here

I tried allocating the maximum workspace size in an attempt to resolve the error, but that does not help; I see the same error despite the change in workspace size.

Is this an error specific to hardware or to software?

Your insights will be very helpful.

Thank you!

Hi,

Sorry for the typo.

The error indicates that the reshape layer uses -1, which is NOT supported by TensorRT currently.

For JetPack 4.6, did you use the same command as for TensorRT 7.2?
A model that works with TensorRT v7.2 is expected to also be supported by TensorRT v8.0.

Thanks.

Hi @AastaLLL ,

Thanks for the clarification!

I used the same command to generate the TRT engine from ONNX in TensorRT v7.2 and v8.0; the pattern is sketched below.
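
For reference, the invocation followed this trtexec pattern (a sketch only; the input names, shapes, and workspace value are placeholders rather than the exact ones used):

    trtexec --onnx=model_dynamic.onnx \
            --minShapes=input_1:1x3x224x224,input_2:1x3x224x224 \
            --optShapes=input_1:1x3x512x512,input_2:1x3x512x512 \
            --maxShapes=input_1:1x3x1024x1024,input_2:1x3x1024x1024 \
            --workspace=4096 \
            --saveEngine=model.engine \
            --verbose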

The problem seems to be with memory allocation on the Jetson Xavier board, and I cannot work out how to resolve it.

Here are my observations:

  1. When I reduce the input dimensions in the optimization profiles, TRT engine creation and inference succeed on the Xavier board
  2. With the larger input dimensions in the optimization profiles, I am able to generate the TRT engine and run inference without any errors on the Quadro P1000 GPU
  3. Despite setting a max workspace size in the range of 10-15 GB, the memory allocation issue persists on the Xavier board, whereas a few MB of workspace are sufficient on the Quadro P1000 for engine creation

I am using the same ONNX model and the same commands on all GPU platforms.

  • Is there something I can change in my code on Xavier to ensure that memory is allocated for larger inputs as well?
  • Why is this memory-related issue specific to the Jetson AGX Xavier?

Thank you.

@AastaLLL

I also encountered a similar error.

[graphShapeAnalyzer.cpp::throwIfError::1306] Error Code 9: Internal Error (Postprocessor/Reshape: -1 wildcard solution does not fit in int32_t).

I used graphsurgeon to remove the wildcard setting from the Reshape node, and the error changed to another one:

[graphShapeAnalyzer.cpp::throwIfError::1306] Error Code 9: Internal Error (Postprocessor/Reshape: reshape changes volume)

I confirmed that inference can still be performed using the model I edited with graphsurgeon.

Also, I can confirm that the input and output volumes of the Reshape node are the same (specifically, a reshape from 1x2034x4 to 2034x4).

I am having trouble with the conversion and would appreciate any hints.

Hi,

The workspace setting only covers part of the memory usage.
Could you check the real memory used on the P1000 via nvidia-smi?
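
For example, memory can be polled while the engine builds (on Jetson, tegrastats is the counterpart, since nvidia-smi is not available there):

    # On the desktop GPU: report memory usage once per second.
    nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1

    # On Jetson boards, which share CPU/GPU memory:
    sudo tegrastats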

Thanks.