Segmentation Fault when creating engine file from fp16 mask-rcnn.etlt file

• Hardware Platform (Jetson / GPU): RTX 3070, 24 GB RAM
• DeepStream Version: NVIDIA-AI-IOT/deepstream_tao_apps (GitHub), branch release/tao3.0
• TensorRT Version: TensorRT=8.0.1
• NVIDIA GPU Driver Version (valid for GPU only): CUDA_VERSION=11.4.1
• How to reproduce the issue?: Running the tao-converter command on mask-rcnn-fp16.etlt results in a segmentation fault and out-of-memory errors.

I’ve trained and exported a custom mask-rcnn model with the TAO 3.0 Toolkit. The export was done with tao mask_rcnn export in the following container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3

When I attempt to convert the fp16 .etlt model file to an engine file for DeepStream inference, the conversion fails with a segmentation fault.

The tao-converter command used is:
./tao-converter -b 1 -k nvidia_tlt -t fp16 -m 1 -d 3,832,1344 \
-o generate_detections,mask_fcn_logits/BiasAdd \
/path-to-fp16.etlt

Error Message:
ERROR: [TRT]: 1: Unexpected exception std::bad_alloc
ERROR: …/nvdsinfer/nvdsinfer_model_builder.cpp:1119 Build engine failed from config file
Segmentation fault (core dumped)

Any ideas on resolving this issue?

Where did you download tao-converter from? Can you share the link?

I have tried both the tao-converter from the base TAO 3.0 Toolkit and the standalone tao-converter build for CUDA 11.3 / cuDNN 8.1 + TensorRT 8.0, downloaded from here:
https://docs.nvidia.com/tao/tao-toolkit-archive/tao-30-2108/text/tensorrt.html#installing-the-tao-converter

Other Notes:

  • the same fp32 etlt model can be converted to an engine file without issue
  • exporting to int8 crashes in the mask-rcnn TAO notebook

Can you log in to the Docker container and retry? Please run the commands below in a terminal instead of in the Jupyter notebook.
$ tao mask_rcnn
then,
$ converter xxx (to generate fp32 engine)
$ converter xxx (to generate fp16 engine)
$ converter xxx (to generate int8 engine)

Also, for int8, please add "-s" and try again.
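
The retry suggested above could be sketched as a small dry-run script that only prints one tao-converter command per precision, so the three invocations can be compared before running them inside the container. This is a rough sketch under assumptions, not a verified recipe: the MODEL path and the engine file names passed to -e are hypothetical placeholders, while the key, input dimensions and output nodes are copied from the command in the original post.

```shell
#!/bin/sh
# Dry run: prints the tao-converter commands instead of executing them.
# MODEL is a hypothetical placeholder; point it at the exported .etlt file.
MODEL="${MODEL:-/path-to-model.etlt}"

build_cmd() {
  p="$1"      # precision: fp32, fp16 or int8
  extra=""
  # "-s" is added for int8 only, as suggested in the reply above.
  if [ "$p" = "int8" ]; then
    extra=" -s"
  fi
  # The engine name given to -e is an illustrative assumption.
  printf '%s\n' "./tao-converter -b 1 -k nvidia_tlt -t ${p} -m 1 -d 3,832,1344 -o generate_detections,mask_fcn_logits/BiasAdd -e mask_rcnn_${p}.engine${extra} ${MODEL}"
}

for prec in fp32 fp16 int8; do
  build_cmd "$prec"
done
```

Once the printed commands look right, they can be pasted into the container shell one at a time. Note that an int8 build will probably also need a calibration cache, which this sketch leaves out.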

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one.
Thanks
