I developed a CNN-based regression model.
To make the model lighter, I unified the kernel sizes of the first three of its seven convolutional layers (original: 7x7, 5x5, 3x3, … → lightweight: all 3x3).
I then ran inference on a Jetson Nano with both PyTorch and TensorRT models, but, contrary to expectations, the original model was slightly faster than the lightweight one.
Model: PyTorch / TensorRT
Original: 7.9 ms / 1.6 ms
Lightweight: 8.5 ms / 1.7 ms
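For context on why fewer parameters need not mean lower latency: shrinking a 7x7 kernel to 3x3 does cut the per-layer multiply-accumulate count roughly 5x, so the slowdown is likely a runtime effect (kernel selection, launch overhead, memory layout) rather than arithmetic cost. A back-of-envelope sketch, using hypothetical layer shapes since the post does not give the actual channel counts or input size:

```python
# MACs for one convolution layer: H_out * W_out * C_out * C_in * K * K.
# All layer shapes below are assumptions for illustration only.

def conv_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulate count for a single KxK convolution."""
    return h_out * w_out * c_out * c_in * k * k

# Hypothetical first layer: 3-channel input, 32 filters, 112x112 output.
orig = conv_macs(112, 112, 3, 32, 7)  # 7x7 kernel
lite = conv_macs(112, 112, 3, 32, 3)  # 3x3 kernel

print(f"7x7: {orig:,} MACs, 3x3: {lite:,} MACs, ratio: {orig / lite:.1f}x")
```

Even with ~5x fewer MACs in that layer, backends such as cuDNN and TensorRT pick different kernels per shape (e.g. Winograd for 3x3, specialized implementations for large kernels), so measured latency does not have to track FLOPs.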
I am curious about why this happened.
Thank you for your response.
Hyun-Yong, Kim
Dear @kimhy365 ,
Could you profile the per-layer timings with trtexec for both models (exported to ONNX) to get some insight?
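For reference, on Jetson the trtexec binary usually ships with TensorRT under /usr/src/tensorrt/bin. A command along these lines builds an engine from an ONNX export and dumps per-layer timings (the file names here are placeholders, not from the post):

```shell
# Build engines from the ONNX exports and dump per-layer profiles.
# Paths and file names are placeholders; adjust for your setup.
/usr/src/tensorrt/bin/trtexec \
    --onnx=original.onnx \
    --dumpProfile \
    --separateProfileRun

/usr/src/tensorrt/bin/trtexec \
    --onnx=lightweight.onnx \
    --dumpProfile \
    --separateProfileRun
```

Comparing the two per-layer profiles should show which layers account for the latency gap.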
Thanks a lot for your concern.
I converted the PyTorch model directly into a TensorRT engine using the torch2trt module.
Either way, the result was consistent: the original model is faster than the lightweight model.
I'm looking forward to your answer.
There has been no update from you for a while, so we are assuming this is no longer an issue.
Hence, we are closing this topic. If you need further support, please open a new one.
Thanks
Dear @kimhy365 ,
Could you share repro steps so we can understand the issue?