Hi,
We are encountering difficulties converting a Mask R-CNN model, retrained with TAO Toolkit v3.22.05 (on a Lambda server) and exported to .etlt format, into a TensorRT engine for use with DeepStream 6.0. The backbone is ResNet-50.
Environment
TensorRT Version: 8.0.1.6
GPU Type: Jetson Xavier AGX 16GB
Nvidia Driver Version: L4T 32.6.1
CUDA Version: 10.2.300
CUDNN Version: 8.2.1.32
Operating System + Version: JetPack 4.6
Python Version (if applicable): 3.6.9
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):
Steps To Reproduce
The model was trained with input dims of 3x768x1280.
After following this tutorial: MaskRCNN — TAO Toolkit 3.22.05 documentation (TRT OSS plugins downloaded, libnvinfer_plugin.so.8.0.1 substituted, …), we encounter an error when converting the model from .etlt to .trt format:
[ERROR ] 3: fc6/MatMul: kernel weights has count 67108864 but 12845056 was expected
[ERROR ] 3: fc6/MatMul: kernel weights has count 67108864 but 12845056 was expected
[ERROR ] 3: fc6/MatMul: kernel weights has count 67108864 but 12845056 was expected
[ERROR ] UffParser: Parser error: fc6/BiasAdd: The input to the scale layer is required to have a minimum of 3 dimensions
The same error appears whether we use the tao-converter tool or integrate the .etlt model directly into DeepStream.
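For reference, the conversion was attempted with a tao-converter invocation along these lines (the key, paths, and engine name below are placeholders for illustration; the output node names follow the MaskRCNN section of the TAO converter documentation):

```shell
# Convert the exported .etlt model to a TensorRT engine on the Jetson.
# -k : encryption key used at export time (placeholder here)
# -d : input dimensions in C,H,W order, matching the training spec
# -o : MaskRCNN output nodes per the TAO documentation
# -t : target precision for the engine
tao-converter \
  -k $ENCRYPTION_KEY \
  -d 3,768,1280 \
  -o generate_detections,mask_fcn_logits/BiasAdd \
  -t fp16 \
  -m 1 \
  -e model.fp16.engine \
  model.etlt
```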
Can you help us resolve this issue? Do you have any idea what the source of the problem could be?
How did you generate the custom_model.etlt? Did you save the log?
Or, if you ran all the steps in a Jupyter notebook, you can share the .ipynb file with me.
# Pull pretrained model from NGC
ngc registry model download-version nvidia/tao/pretrained_instance_segmentation:resnet50 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet50
So I was able to run the default notebook without any errors. I then compared the default spec file with mine and found that the mrcnn_resolution parameter was set to 64 in my spec file, whereas it was 28 in the original. After changing it back to 28, I was able to export the model to engine format.
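For anyone hitting the same mismatch, this is the relevant fragment of the training spec after the fix (a minimal sketch, assuming the default layout of the maskrcnn_config block in the TAO MaskRCNN spec; other parameters omitted):

```
maskrcnn_config {
  # ... other MaskRCNN parameters unchanged ...
  # Mask head output resolution; the default notebook uses 28.
  # Setting this to 64 produced the fc6/MatMul weight-count error
  # during .etlt -> engine conversion in our case.
  mrcnn_resolution: 28
}
```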
However, I don’t know what influence this may have on the quality of the results.
Also, have you ever trained with “mrcnn_resolution: 64”, and what is the difference in final mAP between “mrcnn_resolution: 64” and “mrcnn_resolution: 28”?