Running dGPU-optimized TensorRT model on NVIDIA Jetson NX

Description

Hello everyone,

I am trying to optimize a model with the TensorRT executable trtexec on a server-grade NVIDIA Tesla GPU, and I want to run the optimized model on an NVIDIA Jetson NX.
From this post, however, it seems TensorRT-optimized models are specific not only to the TensorRT version but also to the kind of GPU used to generate them.

I would still like to know whether there is a way to port a TensorRT-optimized model from a server-grade GPU to a Jetson by generating some device-independent optimized model, or whether there is any way to simulate a Jetson NX on an Ubuntu server so that it can output such a model.

Thanks in advance for helping me.

Environment

TensorRT Version: 7.1.3.0
GPU Type: NVIDIA Jetson Xavier NX (Developer Kit Version)
NVIDIA Driver Version: NA
CUDA Version: 10.2.89
CUDNN Version: 8.0.0.180
JetPack Version: L4T 32.5.1 [JetPack 4.5.1]
Operating System + Version: Ubuntu 18.04.5 LTS

Hi,
This looks like a Jetson issue. We recommend you raise it on the respective platform using the link below

Thanks!

Hello @NVES, thanks for the suggestion. I have changed the platform of my query.

Hi,

Unfortunately, since the GPU architecture is different (dGPU vs. iGPU), an engine created on a Tesla cannot be deployed on the Jetson NX.
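A common workaround is to treat the ONNX file, not the engine, as the portable artifact: copy the ONNX model to the Jetson and build the engine on the device itself, so that TensorRT autotunes for the integrated GPU. Below is a minimal sketch using the TensorRT 7.x Python API; model.onnx and model.engine are placeholder names:

```python
# Minimal sketch: build a TensorRT engine on the Jetson NX itself from a
# portable ONNX file (TensorRT 7.x Python API). File names are placeholders.
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
EXPLICIT_BATCH = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(EXPLICIT_BATCH)
parser = trt.OnnxParser(network, TRT_LOGGER)

# Parse the ONNX model copied over from the server.
with open("model.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise SystemExit("Failed to parse the ONNX model")

config = builder.create_builder_config()
config.max_workspace_size = 1 << 28  # 256 MiB; tune to the NX's shared memory

# build_engine() autotunes kernels for the GPU it runs on (here, the iGPU).
engine = builder.build_engine(network, config)
with open("model.engine", "wb") as f:
    f.write(engine.serialize())
```

The same on-device build can be done with trtexec, which JetPack installs on the Jetson under /usr/src/tensorrt/bin.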

You can find some information in our document:
https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#faq

Q: If I build the engine on one GPU and run the engine on another GPU, does this work?

A: We recommend that you don’t; however, if you do, you’ll need to follow these guidelines:

  1. The major, minor, and patch versions of TensorRT must match between systems. This ensures you are picking kernels that are still present and have not undergone certain optimizations or bug fixes that would change their behavior.
  2. The CUDA compute capability major and minor versions must match between systems. This ensures that the same hardware features are present so the kernel does not fail to execute. An example would be mixing cards with different precision capabilities.
  3. The following properties should match between systems:
    – Maximum GPU graphics clock speed
    – Maximum GPU memory clock speed
    – GPU memory bus width
    – Total GPU memory
    – GPU L2 cache size
    – SM processor count
    – Asynchronous engine count

If any of the previous properties do not match, you receive the following warning: “Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.”

If you still want to proceed, then you should build the engine on the smallest SKU in the family because autotuner choices made on smaller GPUs generalize better.
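If you want to compare these properties between the Tesla server and the Jetson NX, the sketch below prints the values the FAQ lists, plus the TensorRT version. It assumes pycuda is installed on both machines; any other CUDA device-query tool gives the same information:

```python
# Sketch: print the device properties the FAQ says must match. Run this on
# both systems and diff the output. Assumes pycuda is installed
# (pip install pycuda); the names below are standard CUDA device attributes.
import pycuda.driver as cuda

cuda.init()
dev = cuda.Device(0)
attrs = dev.get_attributes()
da = cuda.device_attribute

print("Device:                  ", dev.name())
print("Compute capability:      ", dev.compute_capability())
print("Total memory (MiB):      ", dev.total_memory() // (1024 * 1024))
print("GPU clock (kHz):         ", attrs[da.CLOCK_RATE])
print("Memory clock (kHz):      ", attrs[da.MEMORY_CLOCK_RATE])
print("Memory bus width (bits): ", attrs[da.GLOBAL_MEMORY_BUS_WIDTH])
print("L2 cache size (bytes):   ", attrs[da.L2_CACHE_SIZE])
print("SM count:                ", attrs[da.MULTIPROCESSOR_COUNT])
print("Async engine count:      ", attrs[da.ASYNC_ENGINE_COUNT])

try:
    import tensorrt as trt
    print("TensorRT version:        ", trt.__version__)
except ImportError:
    print("TensorRT version:         (tensorrt Python package not found)")
```

Here the compute-capability check alone already rules out portability: the Xavier NX's integrated Volta GPU reports compute capability 7.2, while, for example, a Tesla V100 reports 7.0.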

Thanks.
