Tao-converter mask_rcnn int8 engine creation fails

Please provide the following information when requesting support.

• Hardware A40
• Network Type Mask_rcnn
• TLT Version tao-toolkit-tf:v3.21.08-py3
Deploying in the TensorRT 21.08 container, with the open-source plugin install script:
/opt/tensorrt/install_opensource.sh -b 21.08

peoplenetPruned.txt (2.0 KB)

• Training spec file (if you have one, please share it here)
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)

I am attempting to deploy a mask_rcnn “peopleSegnet” exported model using tao-converter. The conversion completes successfully with FP32 and FP16 but fails with INT8. I use the following command:
./tao-converter -d 3,576,960 \
  -k nvidia_tlt \
  -o generate_detections,mask_fcn_logits/BiasAdd \
  -c /workspace/bantam/peopleNet/100221Calibration.cache \
  -e int8.engine \
  -b 6 \
  -m 6 \
  -w 15000000000 \
  -t int8 \
  /workspace/bantam/peopleNet/pruned100221_exported.etlt
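As a sanity check before digging deeper, a small helper like this can verify that the .etlt model and the calibration cache exist and are non-empty before invoking tao-converter. The `check_inputs` function is purely an illustrative sketch, not part of tao-converter:

```shell
# check_inputs: verify each given file exists and is non-empty (illustrative helper)
check_inputs() {
  for f in "$@"; do
    if [ ! -s "$f" ]; then
      echo "missing or empty: $f" >&2
      return 1
    fi
  done
  echo "inputs look OK"
}

# e.g. before running tao-converter:
# check_inputs /workspace/bantam/peopleNet/pruned100221_exported.etlt \
#              /workspace/bantam/peopleNet/100221Calibration.cache
```

A missing or zero-byte calibration cache is a common cause of INT8-only failures, so ruling that out first is cheap.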

I get the following message:
[ERROR] 1: Unexpected exception std::bad_alloc
[ERROR] Unable to create engine
Segmentation fault (core dumped)

If I increase the workspace size to 25,000,000,000 bytes, I get this error:

[WARNING] Memory requirements of format conversion cannot be satisfied during timing, format rejected.
[WARNING] Internal error: cannot reformat, disabling format. Try decreasing the workspace size with IBuilderConfig::setMaxWorkspaceSize().
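For reference, the -w flag is specified in bytes, so plain shell arithmetic (nothing tao-specific) shows how large these workspace requests actually are:

```shell
# Convert the -w workspace values from bytes to GiB (integer division)
echo $(( 15000000000 / 1024 / 1024 / 1024 ))   # first attempt, roughly 13 GiB
echo $(( 25000000000 / 1024 / 1024 / 1024 ))   # second attempt, roughly 23 GiB
```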

Can you try:
$ ./tao-converter -k nvidia_tlt -d 3,576,960 -o generate_detections,mask_fcn_logits/BiasAdd -t int8 -c peoplesegnet_resnet50_int8.txt -m 1 -w 100000000 peoplesegnet_resnet50.etlt

I receive:

[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 2985, GPU 1582 (MiB)
[ERROR] 1: Unexpected exception std::bad_alloc
[ERROR] Unable to create engine
Segmentation fault (core dumped)

I am using tao-converter for CUDA 11.3 / cuDNN 8.1 / TensorRT 8.0 - is this correct for container nvcr.io/nvidia/tensorrt:21.08-py3?

So, you downloaded tao-converter inside the nvcr.io/nvidia/tensorrt:21.08-py3 container and then generated the TRT engine?

yes

Can you run the command below inside nvcr.io/nvidia/tensorrt:21.08-py3 to check the CUDA/TensorRT/cuDNN versions?
$ dpkg -l |grep cuda

ii cuda-cccl-11-4 11.4.43-1 amd64 CUDA CCCL
ii cuda-compat-11-4 470.57.02-1 amd64 CUDA Compatibility Platform
ii cuda-cudart-11-4 11.4.108-1 amd64 CUDA Runtime native Libraries
ii cuda-cudart-dev-11-4 11.4.108-1 amd64 CUDA Runtime native dev links, headers
ii cuda-cuobjdump-11-4 11.4.43-1 amd64 CUDA cuobjdump
ii cuda-cupti-11-4 11.4.100-1 amd64 CUDA profiling tools runtime libs.
ii cuda-cupti-dev-11-4 11.4.100-1 amd64 CUDA profiling tools interface.
ii cuda-driver-dev-11-4 11.4.108-1 amd64 CUDA Driver native dev stub library
ii cuda-gdb-11-4 11.4.100-1 amd64 CUDA-GDB
ii cuda-memcheck-11-4 11.4.100-1 amd64 CUDA-MEMCHECK
ii cuda-nvcc-11-4 11.4.100-1 amd64 CUDA nvcc
ii cuda-nvdisasm-11-4 11.4.100-1 amd64 CUDA disassembler
ii cuda-nvml-dev-11-4 11.4.43-1 amd64 NVML native dev links, headers
ii cuda-nvprof-11-4 11.4.100-1 amd64 CUDA Profiler tools
ii cuda-nvprune-11-4 11.4.100-1 amd64 CUDA nvprune
ii cuda-nvrtc-11-3 11.3.109-1 amd64 NVRTC native runtime libraries
ii cuda-nvrtc-11-4 11.4.100-1 amd64 NVRTC native runtime libraries
ii cuda-nvrtc-dev-11-3 11.3.109-1 amd64 NVRTC native dev links, headers
ii cuda-nvrtc-dev-11-4 11.4.100-1 amd64 NVRTC native dev links, headers
ii cuda-nvtx-11-4 11.4.100-1 amd64 NVIDIA Tools Extension
ii cuda-sanitizer-11-4 11.4.108-1 amd64 CUDA Sanitizer
ii cuda-toolkit-11-4-config-common 11.4.108-1 all Common config package for CUDA Toolkit 11.4.
ii cuda-toolkit-11-config-common 11.4.108-1 all Common config package for CUDA Toolkit 11.
ii cuda-toolkit-config-common 11.4.108-1 all Common config package for CUDA Toolkit.
ii libcudnn8 8.2.2.26-1+cuda11.4 amd64 cuDNN runtime libraries
ii libcudnn8-dev 8.2.2.26-1+cuda11.4 amd64 cuDNN development libraries and headers
ii libnccl-dev 2.10.3-1+cuda11.4 amd64 NVIDIA Collective Communication Library (NCCL) Development Files
ii libnccl2 2.10.3-1+cuda11.4 amd64 NVIDIA Collective Communication Library (NCCL) Runtime
ii libnvinfer-bin 8.0.1-1+cuda11.3 amd64 TensorRT binaries
ii libnvinfer-dev 8.0.1-1+cuda11.3 amd64 TensorRT development libraries and headers
ii libnvinfer-plugin-dev 8.0.1-1+cuda11.3 amd64 TensorRT plugin libraries and headers
ii libnvinfer-plugin8 8.0.1-1+cuda11.3 amd64 TensorRT plugin libraries
ii libnvinfer8 8.0.1-1+cuda11.3 amd64 TensorRT runtime libraries
ii libnvonnxparsers-dev 8.0.1-1+cuda11.3 amd64 TensorRT ONNX libraries
ii libnvonnxparsers8 8.0.1-1+cuda11.3 amd64 TensorRT ONNX libraries
ii libnvparsers-dev 8.0.1-1+cuda11.3 amd64 TensorRT parsers libraries
ii libnvparsers8 8.0.1-1+cuda11.3 amd64 TensorRT parsers libraries
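For a quicker summary, the package name and version (columns 2 and 3 of dpkg -l output) can be pulled out with awk. The `summarize` helper below is just one illustrative way to do it; here it is fed a sample line from the listing above rather than live dpkg output:

```shell
# summarize: print package name and version (columns 2 and 3) of installed rows
summarize() { awk '/^ii/ {print $2, $3}'; }

printf 'ii libnvinfer8 8.0.1-1+cuda11.3 amd64 TensorRT runtime libraries\n' | summarize

# To check the container live (run inside nvcr.io/nvidia/tensorrt:21.08-py3):
#   dpkg -l | grep -E 'cuda-cudart|libcudnn8|libnvinfer8' | summarize
```

Note that the listing above shows CUDA 11.4 / cuDNN 8.2 / TensorRT 8.0 packages, which is worth comparing against the versions the downloaded tao-converter build targets.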

Where did you download tao-converter? Can you share the link?

from the page TAO Toolkit Get Started | NVIDIA Developer
I used https://developer.nvidia.com/tao-converter-80

How about generating an FP16 TRT engine? Is that successful?

Also, please try the official demo models mentioned in GitHub - NVIDIA-AI-IOT/deepstream_tao_apps at release/tao3.0.

FP16 and FP32 engines generate successfully and provide accurate inference.