Performance impact of a JIT-converted model used via libtorch on Jetson Xavier

We have a segmentation model pre-trained with PyTorch. To make it also work with libtorch for deployment in our product, we used JIT to convert the model.

After converting, we compared performance and found that the transfer-learning (training) speed nearly halves with the converted model under libtorch on Jetson Xavier.
So the question is: is this expected on Jetson Xavier? Is there any recommended practice for converting a PyTorch model for use with libtorch on Jetson Xavier?
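For reference, the conversion we did looks roughly like the sketch below (the segmentation network is replaced by a stand-in here; shapes and file names are placeholders):

```python
import torch
import torch.nn as nn

# Stand-in for our segmentation network (hypothetical; the real model differs).
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 2, 1),  # 2 classes: positive / negative
)
model.eval()

# Trace with a representative input; torch.jit.script is the alternative
# when the forward pass contains data-dependent control flow.
example = torch.rand(1, 3, 512, 512)
traced = torch.jit.trace(model, example)
traced.save("model_jit.pt")  # loaded from libtorch with torch::jit::load
```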

Environment

TensorRT Version : 7.1.3.0
GPU Type : Xavier NX
Nvidia Driver Version : Jetpack 4.4.1
CUDA Version : 10.2.89
CUDNN Version : 8.0.0.180
Operating System + Version : Ubuntu 18.04
PyTorch Version (if applicable) : 1.9
Baremetal or Container (if container which image + tag) :

Hi,

It’s recommended to upgrade your environment to our latest software version first.
Since libtorch is a third-party library, you can get more information from the library owner.

In general, we recommend converting the model into a TensorRT engine.
TensorRT is optimized for the Jetson platform and can give you better performance and friendlier memory usage.

Thanks.

Thanks for the suggestion.

Is there any recommended practice for converting a model from PyTorch to a TensorRT engine?

Hi,

You can check below for an example:
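A minimal sketch of the usual flow (the model, shapes, and file names below are placeholders): export the PyTorch model to ONNX, then build a TensorRT engine from the ONNX file with the trtexec tool bundled with JetPack.

```python
import torch
import torch.nn as nn

model = nn.Sequential(  # stand-in for the trained segmentation network
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 2, 1),
)
model.eval()

# Export to ONNX with a representative input shape.
dummy = torch.rand(1, 3, 512, 512)
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    opset_version=11,
)

# Then build the engine on the Jetson, e.g. with the bundled trtexec tool:
#   /usr/src/tensorrt/bin/trtexec --onnx=model.onnx --saveEngine=model.trt --fp16
```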

Thanks.

Thanks for the suggestion,

In our application, we'd like to give the user the ability to re-train the model by annotating some positive/negative areas.
However, this is not supported by TensorRT, is it?

Hi,

TensorRT is an inference engine, so it doesn't implement training algorithms (e.g. backpropagation).
But you can retrain the model with another framework (e.g. PyTorch) and convert it to TensorRT afterwards.
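For example, a minimal sketch of the retrain-then-convert flow (the model, data, and hyper-parameters below are placeholders):

```python
import torch
import torch.nn as nn

model = nn.Sequential(  # stand-in for the deployed segmentation network
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Conv2d(16, 2, 1),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Fine-tune in PyTorch on the user's annotated positive/negative areas.
model.train()
for _ in range(10):  # placeholder data standing in for user annotations
    images = torch.rand(2, 3, 512, 512)
    masks = torch.randint(0, 2, (2, 512, 512))
    optimizer.zero_grad()
    loss = loss_fn(model(images), masks)
    loss.backward()
    optimizer.step()

# Afterwards, export again and rebuild the TensorRT engine as above.
model.eval()
torch.onnx.export(model, torch.rand(1, 3, 512, 512), "model_retrained.onnx",
                  opset_version=11)
```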

Thanks.

Thanks for reply.

Actually, in our application we used PyTorch, and when we wanted to integrate it into the product, we used the JIT tool to convert it for libtorch.
We found that, on Jetson Xavier, the converted model under libtorch is even slower in training than the original under PyTorch.
This is very strange; we suspect we are not making good use of the CUDA resources.
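One thing we want to rule out is a measurement artifact. A minimal sketch of how we think a training step should be timed fairly (stand-in model, placeholder shapes; warm-up iterations excluded, CUDA synchronized before reading the clock):

```python
import time
import torch
import torch.nn as nn

device = torch.device("cuda")
model = nn.Sequential(  # stand-in for the real network
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 2, 1),
).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

images = torch.rand(2, 3, 512, 512, device=device)
masks = torch.randint(0, 2, (2, 512, 512), device=device)

def step():
    optimizer.zero_grad()
    loss_fn(model(images), masks).backward()
    optimizer.step()

# Warm-up: the first iterations include cuDNN autotuning and (for a scripted
# model) JIT optimization passes, so exclude them from the measurement.
for _ in range(5):
    step()

torch.cuda.synchronize()  # drain queued kernels before starting the clock
start = time.time()
for _ in range(20):
    step()
torch.cuda.synchronize()
print((time.time() - start) / 20, "seconds per training step")
```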

Hi,

Just want to confirm first.

Do you see the slowness on Xavier compared to the desktop?
If yes, do you have a GPU on the desktop as well?

Thanks.

Hi,

Yes, we also see slowness when comparing performance on Xavier against a Windows desktop, but that is expected, since the desktop workstation also has a GPU.

The problem is that, with both runs on Xavier, we see reduced training performance with libtorch compared to PyTorch.

BTW: could you kindly suggest a tutorial covering how to build libtorch on Jetson Xavier? Thanks!