I have trained a ResNet-18 model in Caffe and exported it to .onnx. Now I would like to run this model with TensorRT for inference. As a first step, I implemented a library that uses nvdsinfer.h, and then used this library to obtain the inference results. However, my goal is to replace a third-party inference library in a legacy project, so I need to match its performance, if not beat it.
Here is where it gets interesting. I did a performance analysis: on a Quadro P4000 my mean inference time is 11 ms, but on a Jetson TX1 it is 30 ms. The third-party library I want to replace achieves 10 ms on the Quadro P4000 and 12 ms on the Jetson TX1.
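For reference, by "mean inference time" I mean wall-clock latency averaged over repeated calls after a warm-up phase. A minimal sketch of the kind of harness I used (the `infer` callable here is just a placeholder for my nvdsinfer-based inference call, not the real API):

```cpp
#include <chrono>
#include <numeric>
#include <vector>

// Hypothetical harness: times `iters` calls of `infer` (a stand-in for the
// real inference call) and returns the mean latency in milliseconds.
// Warm-up iterations are discarded so lazy initialization (engine build,
// CUDA context creation, first-call allocations) does not skew the mean.
template <typename F>
double mean_latency_ms(F&& infer, int iters, int warmup = 10) {
    for (int i = 0; i < warmup; ++i) infer();   // discard warm-up runs
    std::vector<double> ms;
    ms.reserve(iters);
    for (int i = 0; i < iters; ++i) {
        auto t0 = std::chrono::steady_clock::now();
        infer();
        auto t1 = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return std::accumulate(ms.begin(), ms.end(), 0.0) / static_cast<double>(ms.size());
}
```

Both my library and the third-party one were measured the same way, so the 30 ms vs. 12 ms gap on the TX1 is not a measurement artifact.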
Since the project is deployed on a Jetson TX1, the TX1 numbers are the ones that matter for the comparison.
How can my inference library perform so much worse on the Jetson TX1 when its performance was on par on the Quadro P4000? What can I do to solve this problem? Any guidance would be appreciated.
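One variable I have not yet ruled out is the TX1's power and clock configuration, since the board does not run at maximum clocks by default. The commands below are what I understand to be the standard JetPack way to control this, plus an independent baseline using TensorRT's bundled benchmark tool (the model filename `resnet18.onnx` is a placeholder):

```shell
# Select the maximum-performance power profile
# (mode IDs vary by board and JetPack version)
sudo nvpmodel -m 0

# Lock CPU/GPU/EMC clocks to their maximum for stable benchmarking
sudo jetson_clocks

# Independent latency baseline with TensorRT's bundled trtexec tool;
# --fp16 enables half precision, which the TX1's Maxwell GPU benefits from
/usr/src/tensorrt/bin/trtexec --onnx=resnet18.onnx --fp16
```

If trtexec reports latency close to the third-party library's 12 ms, the gap is in my library rather than in the engine itself.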
- The model was trained on the Quadro P4000.
Module: Jetson TX1
TensorRT Version: 7.1.3
GPU Type: NVIDIA Tegra X1