[TensorRT] INTERNAL ERROR: Assertion failed: validateInputsCutensor(src, dst)

Environment
Virtual environment from Anaconda:

  • python 3.7.5
  • pytorch 1.3.1
  • tensorrt 6.0.1.5
  • onnx 1.6.0
  • protobuf 3.9.2

with

  • Ubuntu 16.04
  • RTX2080TI on driver 410.79
  • CUDA 10.0
  • cudnn 7.6.3

Problem Description
1. Pytorch2ONNX
I use the default opset 9 to convert. Some warnings occur:

TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  x_size = torch.tensor(x.shape)
TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  x_size = torch.tensor(x.shape)
UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch.
ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode).
We recommend using opset 11 and above for models using this operator. 
  "" + str(_export_onnx_opset_version) + ". "

Anyway, the conversion itself completes successfully.

onnx.checker.check_model()

passes, and the model's structure, visualized in Netron, looks fine to me.
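
For reference, this is roughly the export/check code I run (a sketch; the model class, path, and fixed input size are placeholders for my actual setup):

import torch
import onnx

# MyModel is a placeholder for my actual network;
# PyTorch 1.3 exports with opset 9 by default
model = MyModel().eval()
dummy_input = torch.rand(1, 3, 256, 256)
torch.onnx.export(model, dummy_input, "model.onnx")

# structural check on the exported graph
onnx_model = onnx.load("model.onnx")
onnx.checker.check_model(onnx_model)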

2. ONNX2TRT

onnx_file.seek(0)
if not parser.parse(onnx_file.read()):
    print("Error:", parser.get_error(0))
    onnx_file.close()
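
For completeness, that snippet sits inside a fairly standard builder/parser setup, roughly like this (a sketch; the file name, the explicit-batch flag, and the workspace size are my own placeholders):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

onnx_file = open("model.onnx", "rb")
if not parser.parse(onnx_file.read()):
    print("Error:", parser.get_error(0))
onnx_file.close()

builder.max_workspace_size = 1 << 30          # placeholder workspace size
engine = builder.build_cuda_engine(network)   # build the engine from the parsed network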

That is how I parse the ONNX model, and the TRT parser seems to handle it: when I log TRT output with trt.Logger(trt.Logger.INFO), it correctly prints my ONNX model's contents, like

[TensorRT] INFO: 292:Constant -> (4)
[TensorRT] INFO: 293:Upsample -> (1, 3, -1, -1)
[TensorRT] INFO: 294:Conv -> (1, 64, -1, -1)
[TensorRT] INFO: 295:BatchNormalization -> (1, 64, -1, -1)
[TensorRT] INFO: 296:Relu -> (1, 64, -1, -1)
[TensorRT] INFO: 297:MaxPool -> (1, 64, -1, -1)
[TensorRT] INFO: 298:Conv -> (1, 64, -1, -1)
[TensorRT] INFO: 299:BatchNormalization -> (1, 64, -1, -1)
[TensorRT] INFO: 300:Relu -> (1, 64, -1, -1)
[TensorRT] INFO: 301:Conv -> (1, 64, -1, -1)
[TensorRT] INFO: 302:BatchNormalization -> (1, 64, -1, -1)
[TensorRT] INFO: 303:Relu -> (1, 64, -1, -1)
[TensorRT] INFO: 304:Conv -> (1, 256, -1, -1)
[TensorRT] INFO: 305:BatchNormalization -> (1, 256, -1, -1)
[TensorRT] INFO: 306:Conv -> (1, 256, -1, -1)
[TensorRT] INFO: 307:BatchNormalization -> (1, 256, -1, -1)

Sadly, a TRT internal error occurs, and I have no idea what it means.

[TensorRT] INTERNAL ERROR: Assertion failed: validateInputsCutensor(src, dst)
../rtSafe/cuda/cutensorReformat.cpp:226
Aborting...

[TensorRT] ERROR: ../rtSafe/cuda/cutensorReformat.cpp (226) - Assertion Error in executeCutensor: 0 (validateInputsCutensor(src, dst))

Looking forward to your kind response.

Hi,

The TRT 6 ONNX parser isn't compatible with ONNX models exported from PyTorch 1.3; if you downgrade to PyTorch 1.2, this issue should go away.

We recommend using TRT 7, which supports PyTorch 1.3.
Please refer to the link below for more details:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-release-notes/tensorrt-7.html#tensorrt-7

Thanks

Thanks for your reply. When I enable verbose logging with TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE), some of the output catches my attention:

Float(1, (# 3 (SHAPE in_frame)), (* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame))), (* 3 (* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame))))) -> Float(1, (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01), (* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01)), (* 3 (* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01))))

The corresponding line from the PyTorch model is:

class Net(nn.Module):
    ...
    def forward(self, x):
        x_resized = F.interpolate(x, size=(128, 128), mode='nearest')
        ...

x is the input to the network and has a constant size (1x3x256x256), so the debug information can be read as:

in_frame     = torch.rand(1, image_channel, image_size, image_size).to(device)
                          0              1           2           3

Float(1,                                                      => 1
(# 3 (SHAPE in_frame)),                                       => image_size
(* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame))),            => image_size * image_size
(* 3 (* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame)))))      => 3 * image_size * image_size

-> 


Float(1,
(ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01),
(* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01)),
(* 3 (* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01))))

The ONNX_RESIZE operator takes in_frame from 256x256 to 128x128 by applying a scale factor of 5.000000e-01: the fixed target size (128, 128) I passed to F.interpolate has been turned into a relative scale, so the shapes TRT derives for this layer are not what I expect.
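
To double-check this, one can inspect the exported node directly; under opset 9 the Upsample node carries a scales tensor computed from the traced 256x256 input rather than the fixed target size (a sketch; the path is a placeholder):

import onnx

onnx_model = onnx.load("model.onnx")
for node in onnx_model.graph.node:
    if node.op_type == "Upsample":
        # under opset 9 the node takes a scales input (the Constant -> (4)
        # seen in the parser log) instead of the fixed (128, 128) target size
        print(node.name, node.input, node.attribute)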

The ONNX2TRT parsing path seems to be the problem here. I have since rebuilt my PyTorch model directly with the TRT Python API, and everything works, so I will close this issue.
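
In case it helps anyone else, the workaround is to define the layers with the TRT network definition API, where the resize can be pinned to a fixed output shape instead of a scale factor; a rough sketch (the layer parameters are placeholders for the real model):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

in_frame = network.add_input("in_frame", trt.float32, (1, 3, 256, 256))

# IResizeLayer accepts a fixed output shape, matching F.interpolate(size=(128, 128))
resize = network.add_resize(in_frame)
resize.shape = (1, 3, 128, 128)
resize.resize_mode = trt.ResizeMode.NEAREST
# ... remaining Conv / BatchNormalization / Relu layers added the same way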

Thank you again.