I am running inference with ResNet50 using TensorRT in Python, with JetPack 5.0.2-b231 on a Jetson AGX Xavier. I process a variable number of detections per frame to extract features, so the engine was generated with a dynamic batch dimension from an ONNX model with variable input and output shapes. The problem is that with dynamic batch, the process is much slower using TensorRT than using the original PyTorch model. You can find the original model here: GitHub - HobbitLong/SupContrast: PyTorch implementation of "Supervised Contrastive Learning" (and SimCLR incidentally)
I would like to know if there is any way to get higher performance using dynamic shape with TensorRT.
Could you share the performance data you got with TensorRT and PyTorch, along with the detailed steps to reproduce the results, so we can check them in our environment as well?
The inference time with PyTorch is about 63 ms per frame, and with TensorRT it is about 686 ms per frame.
Every frame has about 10 detections, so I created the engine with the minShapes, maxShapes, and optShapes parameters:
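For reference, engine creation with an optimization profile can be done with trtexec. This is only a sketch: the input tensor name "input", the ONNX file name, and the exact shapes are assumptions that need to be checked against the actual model. Since the typical frame has ~10 detections, setting optShapes near that batch size lets TensorRT pick kernels tuned for the common case:

```shell
# Sketch: build a dynamic-batch engine tuned for ~10 detections per frame.
# "input" and the file names are placeholders -- verify against your ONNX model.
trtexec --onnx=resnet50_supcon.onnx \
        --minShapes=input:1x3x224x224 \
        --optShapes=input:10x3x224x224 \
        --maxShapes=input:16x3x224x224 \
        --fp16 \
        --saveEngine=resnet50_supcon.engine
```

If optShapes is left at the minimum while inference runs at larger batches, the selected kernels can be far from optimal for the shapes actually used.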
Yes, I am using the model shared in the repository. You must follow exactly the same steps I showed in my previous post, loading the model with the repository's network definition.
I shared the ONNX model with you in a zipped folder, and I think you can convert it to TensorRT directly.
Thanks for your patience.
Could you verify if the PyTorch inference is also using batch size=8?
We tested the ONNX model with TensorRT and ONNXRuntime on Xavier.
With TensorRT, we got 84.6757 ms for batch size=1 and 652.653 ms for batch size=8.
With ONNXRuntime, batch size=1 takes 94.296 ms while batch size=8 needs 684.891 ms.
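When comparing numbers like these across frameworks, it helps to time both the same way: discard warm-up iterations, average over many steady-state runs, and (for GPU inference) synchronize the device before stopping the clock. A minimal, framework-agnostic timing helper might look like the sketch below; the function name and defaults are illustrative, not from the original posts:

```python
import time


def benchmark(fn, warmup=10, iters=50):
    """Return the mean latency of fn in milliseconds.

    For GPU inference, fn itself should include a device
    synchronization (e.g. torch.cuda.synchronize() or
    stream.synchronize()) so the timer covers the full kernel
    execution, not just the asynchronous launch.
    """
    # Warm-up: first calls often include allocation / JIT / cache effects.
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1000.0
```

Wrapping the PyTorch forward pass and the TensorRT execute call in the same harness rules out measurement differences as the source of the gap.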
So it looks like the performance difference comes from the different batch sizes used.
Yes, PyTorch inference is also using batch size=8.
I agree with your TensorRT measurements; I get the same times for batch size=1 and batch size=8. But the question is: why is the process so much slower with TensorRT than with PyTorch?
And finally, as I asked in my first post: is there any way to get higher performance using dynamic shapes with TensorRT?
Have you tried the latest model and latest source shared in the repository?
It doesn’t work, because of the error mentioned in the Oct 26 post:
size mismatch for encoder.conv1.weight: copying a param with shape torch.Size([64, 3, 7, 7]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3]).
Based on your implementation, does the input contain 8 images of size 3x224x224?
Sorry, I was wrong about the code. I had forgotten that I changed it to make it work. In the resnet_big.py file, you must change self.shortcut to self.downsample, and change the kernel_size in line 80 to 7. I hope it works.
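The conv1 kernel_size mismatch (3x3 vs. 7x7) can only be fixed by editing the model definition, but the shortcut/downsample naming mismatch can alternatively be handled by renaming the checkpoint keys instead of the code. This is only a sketch of that alternative, assuming the checkpoint uses "downsample" in its parameter names while the unmodified code uses "shortcut" (verify against your actual checkpoint before relying on it):

```python
def remap_state_dict(state_dict, old="downsample", new="shortcut"):
    """Rename parameter keys so a checkpoint matches the attribute
    names used by the current model definition. Purely string-based,
    so it works on any dict regardless of the value types."""
    return {key.replace(old, new): value for key, value in state_dict.items()}
```

The remapped dict can then be passed to model.load_state_dict() without touching resnet_big.py for that part of the error.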
Thanks for the hint.
We can run the model with PyTorch after the change you mentioned.
Below is the performance data we measured for batch=1 and batch=8.
It seems that TensorRT gives better performance than ONNXRuntime or PyTorch.