[TensorRT] INTERNAL ERROR: Assertion failed: validateInputsCutensor(src, dst)

Environment
Virtual environment from Anaconda:

  • python 3.7.5
  • pytorch 1.3.1
  • tensorrt 6.0.1.5
  • onnx 1.6.0
  • protobuf 3.9.2

with

  • Ubuntu 16.04
  • RTX2080TI on driver 410.79
  • CUDA 10.0
  • cudnn 7.6.3

Problem Description
1. Pytorch2ONNX
I use the default opset 9 to convert. Some warnings occur:

TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  x_size = torch.tensor(x.shape)
TracerWarning: Converting a tensor to a Python integer might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  x_size = torch.tensor(x.shape)
UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch.
ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode).
We recommend using opset 11 and above for models using this operator. 
  "" + str(_export_onnx_opset_version) + ". "

Anyway, the conversion itself completes successfully.

onnx.checker.check_model()

passes, and the model's structure, visualized in Netron, looks fine to me.
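
For reference, this is roughly the export/check code I run (a sketch; the model class, path, and fixed input size are placeholders for my actual setup):

import torch
import onnx

# MyModel is a placeholder for my actual network;
# PyTorch 1.3 exports with opset 9 by default
model = MyModel().eval()
dummy_input = torch.rand(1, 3, 256, 256)
torch.onnx.export(model, dummy_input, "model.onnx")

# structural check on the exported graph
onnx_model = onnx.load("model.onnx")
onnx.checker.check_model(onnx_model)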

2. ONNX2TRT

onnx_file.seek(0)
if not parser.parse(onnx_file.read()):
    print("Error:", parser.get_error(0))
    onnx_file.close()
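
For completeness, that snippet sits inside a fairly standard builder/parser setup, roughly like this (a sketch; the file name, the explicit-batch flag, and the workspace size are my own placeholders):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

onnx_file = open("model.onnx", "rb")
if not parser.parse(onnx_file.read()):
    print("Error:", parser.get_error(0))
onnx_file.close()

builder.max_workspace_size = 1 << 30          # placeholder workspace size
engine = builder.build_cuda_engine(network)   # build the engine from the parsed network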

That is how I parse the ONNX model, and the TRT parser seems to handle it: when I log TRT output with trt.Logger(trt.Logger.INFO), it correctly prints my ONNX model's contents, like

[TensorRT] INFO: 292:Constant -> (4)
[TensorRT] INFO: 293:Upsample -> (1, 3, -1, -1)
[TensorRT] INFO: 294:Conv -> (1, 64, -1, -1)
[TensorRT] INFO: 295:BatchNormalization -> (1, 64, -1, -1)
[TensorRT] INFO: 296:Relu -> (1, 64, -1, -1)
[TensorRT] INFO: 297:MaxPool -> (1, 64, -1, -1)
[TensorRT] INFO: 298:Conv -> (1, 64, -1, -1)
[TensorRT] INFO: 299:BatchNormalization -> (1, 64, -1, -1)
[TensorRT] INFO: 300:Relu -> (1, 64, -1, -1)
[TensorRT] INFO: 301:Conv -> (1, 64, -1, -1)
[TensorRT] INFO: 302:BatchNormalization -> (1, 64, -1, -1)
[TensorRT] INFO: 303:Relu -> (1, 64, -1, -1)
[TensorRT] INFO: 304:Conv -> (1, 256, -1, -1)
[TensorRT] INFO: 305:BatchNormalization -> (1, 256, -1, -1)
[TensorRT] INFO: 306:Conv -> (1, 256, -1, -1)
[TensorRT] INFO: 307:BatchNormalization -> (1, 256, -1, -1)

Sadly, a TRT internal error occurs, and I have no idea what it means.

[TensorRT] INTERNAL ERROR: Assertion failed: validateInputsCutensor(src, dst)
../rtSafe/cuda/cutensorReformat.cpp:226
Aborting...

[TensorRT] ERROR: ../rtSafe/cuda/cutensorReformat.cpp (226) - Assertion Error in executeCutensor: 0 (validateInputsCutensor(src, dst))

Looking forward to your kind response.

Hi,

The TRT 6 ONNX parser isn't compatible with ONNX models exported from PyTorch 1.3; if you downgrade to PyTorch 1.2, this issue should go away.

We recommend using TRT 7, which supports PyTorch 1.3.
Please refer to the link below for more details:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-release-notes/tensorrt-7.html#tensorrt-7

Thanks

Thanks for your reply. When I enable verbose logging with TRT_LOGGER = trt.Logger(trt.Logger.VERBOSE), some of the output catches my attention:

Float(1, (# 3 (SHAPE in_frame)), (* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame))), (* 3 (* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame))))) -> Float(1, (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01), (* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01)), (* 3 (* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01))))

The corresponding line from the PyTorch model is:

class Net(nn.Module):
    ...
    def forward(self, x):
        x_resized = F.interpolate(x, size=(128, 128), mode='nearest')
        ...

x is the input to the network and has a constant size (1x3x256x256), so the debug information can be read as:

in_frame     = torch.rand(1, image_channel, image_size, image_size).to(device)
                          0              1           2           3

Float(1,                                                      => 1
(# 3 (SHAPE in_frame)),                                       => image_size
(* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame))),            => image_size * image_size
(* 3 (* (# 3 (SHAPE in_frame)) (# 2 (SHAPE in_frame)))))      => 3 * image_size * image_size

-> 


Float(1,
(ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01),
(* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01)),
(* 3 (* (ONNX_RESIZE (# 3 (SHAPE in_frame)) 5.000000e-01) (ONNX_RESIZE (# 2 (SHAPE in_frame)) 5.000000e-01))))

The ONNX_RESIZE operator takes in_frame from 256x256 to 128x128 by applying a scale factor of 5.000000e-01: the fixed target size (128, 128) I passed to F.interpolate has been turned into a relative scale, so the shapes TRT derives for this layer are not what I expect.
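
To double-check this, one can inspect the exported node directly; under opset 9 the Upsample node carries a scales tensor computed from the traced 256x256 input rather than the fixed target size (a sketch; the path is a placeholder):

import onnx

onnx_model = onnx.load("model.onnx")
for node in onnx_model.graph.node:
    if node.op_type == "Upsample":
        # under opset 9 the node takes a scales input (the Constant -> (4)
        # seen in the parser log) instead of the fixed (128, 128) target size
        print(node.name, node.input, node.attribute)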

The ONNX2TRT parsing path seems to be the problem here. I have since rebuilt my PyTorch model directly with the TRT Python API, and everything works, so I will close this issue.
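
In case it helps anyone else, the workaround is to define the layers with the TRT network definition API, where the resize can be pinned to a fixed output shape instead of a scale factor; a rough sketch (the layer parameters are placeholders for the real model):

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))

in_frame = network.add_input("in_frame", trt.float32, (1, 3, 256, 256))

# IResizeLayer accepts a fixed output shape, matching F.interpolate(size=(128, 128))
resize = network.add_resize(in_frame)
resize.shape = (1, 3, 128, 128)
resize.resize_mode = trt.ResizeMode.NEAREST
# ... remaining Conv / BatchNormalization / Relu layers added the same way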

Thank you again.