This issue is related to getDimensions() and getBindingDimensions() returning different results on the host and on the Jetson AGX Xavier.
So I thought the problem was something wrong with getDimensions(), but I've found an issue with the Slice layer in the ONNX-to-TensorRT parser:
When exporting a network defined in Pytorch as:
class SNET(nn.Module):
    def __init__(self):
        super(SNET, self).__init__()
        self.conv1 = nn.Conv2d(1, 1, 3)

    def forward(self, x):
        x = self.conv1(x)
        return x
With an input size of 100x100, the exported ONNX graph looks like this (visualized using Netron):
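For reference, the output size of that convolution is easy to verify with the standard formula. A quick stdlib-only sketch (the helper name is mine, not from the sample code; the 100x100 input and 3x3 kernel come from the model above):

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    # Standard conv output-size formula: floor((size + 2*p - k) / s) + 1
    return (size + 2 * padding - kernel) // stride + 1

# nn.Conv2d(1, 1, 3) uses default stride=1 and padding=0, so on a
# 100x100 input the output should be 1x1x98x98:
print(conv2d_out(100, 3))  # -> 98
```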
I've modified the sampleOnnxMNIST sample code from TensorRT just to parse the network and print the input and output sizes. I've attached it to the related issue, but I'm adding a simpler version here to make it easier to reproduce. The network parses with no errors, and the input and output sizes are correct, as the following picture shows:
Now, if I add a Slice layer like this in PyTorch:
class SNET(nn.Module):
    def __init__(self):
        super(SNET, self).__init__()
        self.conv1 = nn.Conv2d(1, 1, 3)

    def forward(self, x):
        x = self.conv1(x)
        return x[:, :, 24:74, 24:74]
The exported ONNX graph looks like this:
which of course also makes sense. Now, when parsing this network, the input dimensions are correct but the output dimensions are wrong, as you can see here and can also test yourself.
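For comparison, the correct output shape is easy to compute by hand: the 3x3 convolution takes 100x100 down to 98x98, and the slice [24:74] keeps 50 elements along each spatial axis. A stdlib-only sketch of that arithmetic (the helper name is mine, not from the sample code):

```python
def slice_len(start, stop, size):
    # Length of a Python-style slice [start:stop) over an axis of the given size
    return max(0, min(stop, size) - max(start, 0))

conv_out = 100 - 3 + 1            # 3x3 conv, stride 1, no padding -> 98
h = slice_len(24, 74, conv_out)   # from x[:, :, 24:74, 24:74]
w = slice_len(24, 74, conv_out)
print((1, 1, h, w))  # expected output dimensions: (1, 1, 50, 50)
```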
There are no error codes and no warnings (except about the ir_version, but I've tested different versions and the problem persists, including with PyTorch 1.1 for the export, which is the version recommended in the documentation for this TensorRT release). There is nothing, just the wrong dimensions, and of course the network does not work. I tested multiple things before confirming this. I've reproduced it on Xavier with JetPack 4.3 and also in an NVIDIA Docker image from NGC. Very tricky behavior and the cause of a big headache. I'm tagging @AastaLLL since he saw the other related issue.
You can download the code here: https://drive.google.com/open?id=1GUw5pWP73Ej_FmdxjhZVvtwzuIXdtxXh with both sample nets included. I'll test on TensorRT 7 to see if the behavior is the same, but I remember this issue did not happen on TensorRT 5.
JetPack 4.3 on Xavier
Docker image from NGC:
docker pull nvcr.io/nvidia/tensorrt:19.12-py3
docker run --gpus all -it --rm -v /yourVOLUME:/workspace/smpOnnx nvcr.io/nvidia/tensorrt:19.12-py3
Steps To Reproduce
Make the project and run. To test both behaviors, change sampleOnnx.cpp line 211 from:
params.onnxFileName = "snet_slice.onnx";
to:
params.onnxFileName = "snet_noslice.onnx";