Description
I export an ONNX file with the code below and then run trtexec (trtexec --onnx=tmp.onnx --fp16) to build a TensorRT engine. The build fails with the error shown further down.
In the code, the conv kernel is a dynamic input, so I cannot replace F.conv2d with nn.Conv2d. It seems that TensorRT only supports fixed kernels. Is there any way to handle a dynamic kernel when converting F.conv2d to TensorRT? I would be very grateful for any help.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Conv(nn.Module):
    def __init__(self):
        super(Conv, self).__init__()

    def forward(self, x, kernel):
        # The kernel arrives as a runtime input, not as a stored parameter.
        return F.conv2d(x, kernel, groups=256)


model = Conv()
dummy_input = (torch.randn([1, 256, 21, 21]), torch.randn([256, 1, 4, 4]))
print(model(*dummy_input).size())
model_path = 'tmp.onnx'
torch.onnx.export(model, dummy_input, model_path, verbose=True, export_params=True)
The problem is as follows:
[09/06/2021-15:10:02] [E] [TRT] Conv_0: kernel weights has count 0 but 4096 was expected
[09/06/2021-15:10:02] [E] [TRT] Conv_0: count of 0 weights in kernel, but kernel dimensions (4,4) with 256 input channels, 256 output channels and 256 groups were specified. Expected Weights count is 256 * 4*4 * 256 / 256 = 4096
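The expected count in the error message follows from the shape of a grouped convolution weight tensor. As a small sketch of that arithmetic (the helper name expected_conv_weight_count is hypothetical, not part of TensorRT or PyTorch):

```python
def expected_conv_weight_count(in_channels, out_channels, kernel_size, groups):
    """Number of scalars in a grouped 2D convolution kernel.

    The weight tensor has shape
    (out_channels, in_channels // groups, kernel_h, kernel_w).
    """
    kernel_h, kernel_w = kernel_size
    if in_channels % groups != 0 or out_channels % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    return out_channels * (in_channels // groups) * kernel_h * kernel_w


# Values taken from the error message: 256 * 4*4 * 256 / 256
print(expected_conv_weight_count(256, 256, (4, 4), 256))  # 4096
```

TensorRT found 0 weights instead of 4096 because the kernel was passed as a network input rather than stored as constant weights in the layer.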
Environment
TensorRT Version : 7.1.3.4
GPU Type : V100
Nvidia Driver Version : 440.33.01
CUDA Version : 10.2
CUDNN Version : 8.0.2
Operating System + Version : CentOS
Python Version (if applicable) : 3.6
PyTorch Version (if applicable) : 1.6
NVES
September 6, 2021, 1:20pm
2
Hi,
Could you please share the ONNX model and the script, if you have not already, so that we can assist you better?
In the meantime, you can try a few things:
1) Validate your model with the snippet below.
check_model.py
import sys
import onnx
filename = sys.argv[1]  # path to your ONNX model
model = onnx.load(filename)
onnx.checker.check_model(model)
2) Try running your model with the trtexec command.
https://github.com/NVIDIA/TensorRT/tree/master/samples/opensource/trtexec
If you are still facing issues, please share the trtexec "--verbose" log for further debugging.
Thanks!
Hi,
Thanks for your reply. Here is the verbose log after checking the ONNX model.
&&&& RUNNING TensorRT.trtexec # /home/public/tensorrt/TensorRT-7.1.3.4/bin/trtexec --onnx=tmp.onnx --verbose --fp16
[09/07/2021-16:59:14] [I] === Model Options ===
[09/07/2021-16:59:14] [I] Format: ONNX
[09/07/2021-16:59:14] [I] Model: tmp.onnx
[09/07/2021-16:59:14] [I] Output:
[09/07/2021-16:59:14] [I] === Build Options ===
[09/07/2021-16:59:14] [I] Max batch: 1
[09/07/2021-16:59:14] [I] Workspace: 16 MB
[09/07/2021-16:59:14] [I] minTiming: 1
[09/07/2021-16:59:14] [I] avgTiming: 8
[09/07/2021-16:59:14] [I] Precision: FP32+FP16
[09/07/2021-16:59:14] [I] Calibration:
[09/07/2021-16:59:14] [I] Safe mode: Disabled
[09/07/2021-16:59:14] [I] Save engine:
[09/07/2021-16:59:14] [I] Load engine:
[09/07/2021-16:59:14] [I] Builder Cache: Enabled
[09/07/2021-16:59:14] [I] NVTX verbosity: 0
[09/07/2021-16:59:14] [I] Inputs format: fp32:CHW
[09/07/2021-16:59:14] [I] Outputs format: fp32:CHW
[09/07/2021-16:59:14] [I] Input build shapes: model
[09/07/2021-16:59:14] [I] Input calibration shapes: model
[09/07/2021-16:59:14] [I] === System Options ===
[09/07/2021-16:59:14] [I] Device: 0
[09/07/2021-16:59:14] [I] DLACore:
[09/07/2021-16:59:14] [I] Plugins:
[09/07/2021-16:59:14] [I] === Inference Options ===
[09/07/2021-16:59:14] [I] Batch: 1
[09/07/2021-16:59:14] [I] Input inference shapes: model
[09/07/2021-16:59:14] [I] Iterations: 10
[09/07/2021-16:59:14] [I] Duration: 3s (+ 200ms warm up)
[09/07/2021-16:59:14] [I] Sleep time: 0ms
[09/07/2021-16:59:14] [I] Streams: 1
[09/07/2021-16:59:14] [I] ExposeDMA: Disabled
[09/07/2021-16:59:14] [I] Spin-wait: Disabled
[09/07/2021-16:59:14] [I] Multithreading: Disabled
[09/07/2021-16:59:14] [I] CUDA Graph: Disabled
[09/07/2021-16:59:14] [I] Skip inference: Disabled
[09/07/2021-16:59:14] [I] Inputs:
[09/07/2021-16:59:14] [I] === Reporting Options ===
[09/07/2021-16:59:14] [I] Verbose: Enabled
[09/07/2021-16:59:14] [I] Averages: 10 inferences
[09/07/2021-16:59:14] [I] Percentile: 99
[09/07/2021-16:59:14] [I] Dump output: Disabled
[09/07/2021-16:59:14] [I] Profile: Disabled
[09/07/2021-16:59:14] [I] Export timing to JSON file:
[09/07/2021-16:59:14] [I] Export output to JSON file:
[09/07/2021-16:59:14] [I] Export profile to JSON file:
[09/07/2021-16:59:14] [I]
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::GridAnchor_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::NMS_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::Reorg_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::Region_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::Clip_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::LReLU_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::PriorBox_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::Normalize_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::RPROI_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::BatchedNMS_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::FlattenConcat_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::CropAndResize version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::DetectionLayer_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::Proposal version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::ProposalLayer_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::PyramidROIAlign_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::ResizeNearest_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::Split version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::SpecialSlice_TRT version 1
[09/07/2021-16:59:14] [V] [TRT] Registered plugin creator - ::InstanceNormalization_TRT version 1
----------------------------------------------------------------
Input filename: tmp.onnx
ONNX IR version: 0.0.6
Opset version: 9
Producer name: pytorch
Producer version: 1.6
Domain:
Model version: 0
Doc string:
----------------------------------------------------------------
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::GridAnchor_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::NMS_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::Reorg_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::Region_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::Clip_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::LReLU_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::PriorBox_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::Normalize_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::RPROI_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::BatchedNMS_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::FlattenConcat_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::CropAndResize version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::DetectionLayer_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::Proposal version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::ProposalLayer_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::PyramidROIAlign_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::ResizeNearest_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::Split version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::SpecialSlice_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] Plugin creator already registered - ::InstanceNormalization_TRT version 1
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:202: Adding network input: 0 with dtype: float32, dimensions: (1, 256, 21, 21)
[09/07/2021-16:59:15] [V] [TRT] ImporterContext.hpp:116: Registering tensor: 0 for ONNX tensor: 0
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:202: Adding network input: 1 with dtype: float32, dimensions: (256, 1, 4, 4)
[09/07/2021-16:59:15] [V] [TRT] ImporterContext.hpp:116: Registering tensor: 1 for ONNX tensor: 1
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:103: Parsing node: Conv_0 [Conv]
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:119: Searching for input: 0
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:119: Searching for input: 1
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:125: Conv_0 [Conv] inputs: [0 -> (1, 256, 21, 21)], [1 -> (256, 1, 4, 4)],
[09/07/2021-16:59:15] [V] [TRT] Kernel weights are not set yet. Kernel weights must be set using setInput(1, kernel_tensor) API call.
[09/07/2021-16:59:15] [E] [TRT] Parameter check failed at: ../builder/Layers.cpp::setInput::555, condition: mNetwork->hasExplicitPrecision() ? index <= 1 : index <= 0
[09/07/2021-16:59:15] [V] [TRT] ImporterContext.hpp:141: Registering layer: Conv_0 for ONNX node: Conv_0
[09/07/2021-16:59:15] [E] [TRT] Conv_0: kernel weights has count 0 but 4096 was expected
[09/07/2021-16:59:15] [E] [TRT] Conv_0: count of 0 weights in kernel, but kernel dimensions (4,4) with 256 input channels, 256 output channels and 256 groups were specified. Expected Weights count is 256 * 4*4 * 256 / 256 = 4096
[09/07/2021-16:59:15] [V] [TRT] ImporterContext.hpp:116: Registering tensor: 3_1 for ONNX tensor: 3
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:179: Conv_0 [Conv] outputs: [3 -> ()],
[09/07/2021-16:59:15] [V] [TRT] ModelImporter.cpp:507: Marking 3_1 as output: 3
----- Parsing of ONNX model tmp.onnx is Done ----
[09/07/2021-16:59:15] [W] [TRT] Unused Input: 1
[09/07/2021-16:59:15] [E] [TRT] Conv_0: kernel weights has count 0 but 4096 was expected
[09/07/2021-16:59:15] [E] [TRT] Conv_0: count of 0 weights in kernel, but kernel dimensions (4,4) with 256 input channels, 256 output channels and 256 groups were specified. Expected Weights count is 256 * 4*4 * 256 / 256 = 4096
[09/07/2021-16:59:15] [E] [TRT] Conv_0: kernel weights has count 0 but 4096 was expected
[09/07/2021-16:59:15] [E] [TRT] Conv_0: count of 0 weights in kernel, but kernel dimensions (4,4) with 256 input channels, 256 output channels and 256 groups were specified. Expected Weights count is 256 * 4*4 * 256 / 256 = 4096
[09/07/2021-16:59:15] [E] [TRT] Layer Conv_0 failed validation
[09/07/2021-16:59:15] [E] [TRT] Network validation failed.
[09/07/2021-16:59:15] [E] Engine creation failed
[09/07/2021-16:59:15] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # /home/public/tensorrt/TensorRT-7.1.3.4/bin/trtexec --onnx=tmp.onnx --verbose --fp16
Hi @xingxing-123,
The setInput(1, kernel_tensor) path mentioned in the log is only implemented for Q/DQ support in INT8. Please refer to
https://docs.nvidia.com/deeplearning/tensorrt/best-practices/index.html#qdq-fusion
We are unable to access the ONNX model because we do not have permission.
Thank you.
Hi,
Currently there is no way to provide the kernel as an input tensor. This feature may be provided in a future release.
For now, we recommend serializing such tensors offline and setting them as parameters of a conv layer. Hope this helps.
Thank you.
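One way to read the recommendation above, as a minimal sketch: if the kernel is known before export, store it in the module so that it is written into the ONNX file as constant weights instead of appearing as a second network input. The names FixedKernelConv, export_fixed_kernel_model, and tmp_fixed.onnx are hypothetical, and this has not been verified against TensorRT 7.1.3.4. It only helps when the kernel does not change between inferences.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FixedKernelConv(nn.Module):
    """Depthwise convolution whose kernel is fixed at construction time."""

    def __init__(self, kernel):
        super().__init__()
        # A registered buffer is part of the module state, so it is exported
        # together with the model when export_params=True.
        self.register_buffer("kernel", kernel)

    def forward(self, x):
        # groups equals the number of output channels (depthwise convolution).
        return F.conv2d(x, self.kernel, groups=self.kernel.shape[0])


def export_fixed_kernel_model(kernel, path="tmp_fixed.onnx"):
    """Export the model with the kernel baked in; only x remains an input."""
    model = FixedKernelConv(kernel).eval()
    dummy_x = torch.randn(1, kernel.shape[0], 21, 21)
    torch.onnx.export(model, (dummy_x,), path, export_params=True)
    return path


# Usage, with a kernel computed or loaded offline:
#   kernel = torch.randn(256, 1, 4, 4)
#   export_fixed_kernel_model(kernel)
```

If the kernel really does change on every inference, this sketch does not apply, and the limitation described above still stands.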
Eric_W
February 8, 2023, 8:40pm
7
Hi @spolisetty, do you know if there have been any updates on this? Thanks!