Poor Inference Time on Jetson TX1

Story

I have trained a ResNet18 model in Caffe and exported it as .onnx. Now I would like to use this model with TensorRT for inference. As a start, I implemented a library that uses nvdsinfer.h, and then used this library to acquire the inference results. However, my goal is to replace a 3rd party library that does the inference in a legacy project, so I need the same performance, if not better.

Here is where it gets interesting. In my performance analysis, the mean inference time is 11 ms on the Quadro P4000 but 30 ms on the Jetson TX1. The 3rd party library that I wish to replace takes 10 ms on the Quadro P4000 and 12 ms on the Jetson TX1.

The project is deployed on the Jetson TX1, so the Jetson TX1 numbers are the ones I have to use for the performance comparison.

Question

Why does my inference library perform so much worse on the Jetson TX1 when its performance was average on the Quadro P4000? What can I do to solve this problem? Any guidance will be appreciated.

Note

  • The model was trained on the Quadro P4000.

Environment

Module: Jetson TX1
TensorRT Version: 7.1.3
GPU Type: NVIDIA Tegra X1

Hi,

This looks like a Jetson issue. Please refer to the samples below in case they are useful.

For any further assistance, we will move this post to the Jetson forum.

Thanks!

Are Quadro P4000 and Jetson TX1 comparable?

Well, not on their own, but in the context I gave, I think they are comparable. Anyway, I found the root of the problem: I had been doing the pre-processing of the inputs on the CPU, which is what produced these measurements.
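
For anyone hitting the same thing, timing each stage separately is one way to see whether pre-processing or the actual inference dominates. Below is a minimal sketch in plain Python, not the code from my library: `StageTimer`, `preprocess` and `infer` are hypothetical names, and the two functions are dummy stand-ins for the real CPU pre-processing and the TensorRT execution call.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self):
        self.totals = defaultdict(float)  # stage name -> total seconds
        self.counts = defaultdict(int)    # stage name -> number of timed runs

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean_ms(self, name):
        return 1000.0 * self.totals[name] / self.counts[name]


# Hypothetical stand-ins for the real pipeline stages (assumptions, not real APIs).
def preprocess(frame):
    # Placeholder for CPU-side work such as resizing and normalizing.
    return [p / 255.0 for p in frame]


def infer(tensor):
    # Placeholder for the actual inference call.
    return sum(tensor)


timer = StageTimer()
frame = list(range(256)) * 64  # dummy "image" data

for _ in range(20):
    with timer.stage("preprocess"):
        tensor = preprocess(frame)
    with timer.stage("inference"):
        result = infer(tensor)

for name in ("preprocess", "inference"):
    print(f"{name}: {timer.mean_ms(name):.3f} ms mean over {timer.counts[name]} runs")
```

With real GPU work, the device has to be synchronized before the clock is stopped, because kernel launches are asynchronous; otherwise the "inference" stage would look faster than it is.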