I used onnx2trt to generate an engine file with batch size 1, and the inference result is correct. I then set the batch size to 64, generated a new engine, and ran inference again, but the output is the same as before. My model is a classification network with 10 classes, so with a batch size of 64 I'd expect the output to have shape (64, 10), with all 64 rows identical. However, only the first (1, 10) row is correct; the remaining 63 rows are all zeros.
This is the inference code I used: https://drive.google.com/file/d/1msXAYG9IbIxY1sLZyXdR52ZAtBSBN161/view?usp=sharing
Can anyone take a look and tell me what is wrong with it? Thanks.
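For reference, the batched path I'm trying to follow mirrors the standard implicit-batch pattern from the TensorRT Python samples. This is only a simplified sketch, not my exact script: the engine path, the 3x224x224 input shape, and the variable names are placeholders.

```python
import numpy as np
import pycuda.autoinit  # creates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

BATCH_SIZE = 64
NUM_CLASSES = 10

# Deserialize the engine produced by onnx2trt (path is a placeholder).
logger = trt.Logger(trt.Logger.WARNING)
with open("model.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

context = engine.create_execution_context()

# For an implicit-batch engine, binding shapes exclude the batch dimension,
# so host/device buffers must be sized for the full batch.
# Input shape (3, 224, 224) is an assumption for illustration.
h_input = np.random.rand(BATCH_SIZE, 3, 224, 224).astype(np.float32).ravel()
h_output = np.empty(BATCH_SIZE * NUM_CLASSES, dtype=np.float32)

d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)

stream = cuda.Stream()
cuda.memcpy_htod_async(d_input, h_input, stream)
# batch_size must be passed explicitly; the default of 1 would yield
# exactly one valid row and zeros for the rest, like what I'm seeing.
context.execute_async(batch_size=BATCH_SIZE,
                      bindings=[int(d_input), int(d_output)],
                      stream_handle=stream.handle)
cuda.memcpy_dtoh_async(h_output, d_output, stream)
stream.synchronize()

print(h_output.reshape(BATCH_SIZE, NUM_CLASSES))
```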
Environment: freshly installed JetPack 4.4 DP