The tlt-converter does not work well with TensorRT 6 (Jetson TX2)

Information

• Hardware Platform (Jetson / GPU): Jetson TX2
• DeepStream Version: DeepStream 4.0
• JetPack Version (valid for Jetson only): 4.3-b134
• TensorRT Version: 6.0.1.10-1+cuda10.0

Details

I used nvcr.io/nvidia/tlt-streamanalytics:v1.0.1_py2 to create resnet10_detector.etlt (DetectNet) on x86.
I then tried to convert resnet10_detector.etlt into a TensorRT engine (resnet10_detector.trt) using tlt-converter ( https://developer.nvidia.com/tlt-converter-trt60 ) on the Jetson TX2.
The following messages were output:

./tlt-converter /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/resnet10_detector.etlt \
               -k KEY \
               -c /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/calibration.bin \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -d 3,384,1248 \
               -i nchw \
               -m 64 \
               -t int8 \
               -e /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/resnet10_detector.trt \
               -b 4

[WARNING] Int8 support requested on hardware without native Int8 support, performance will be negatively affected.
[INFO] Reading Calibration Cache for calibrator: EntropyCalibration2
[INFO] Generated calibration scales using calibration cache. Make sure that calibration cache has latest scales.
[INFO] To regenerate calibration cache, please delete the existing one. TensorRT will generate a new calibration cache.
[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[INFO] Detected 1 inputs and 2 output network tensors.
Then I ran deepstream-app:

nvidia@nvidia-desktop:/opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app$ deepstream-app -c source8_1080p_dec_infer-resnet_tracker_tiled_display_fp16_tx1.txt 
Creating LL OSD context new
gstnvtracker: Loading low-level lib at /opt/nvidia/deepstream/deepstream-4.0/lib/libnvds_mot_klt.so
gstnvtracker: Optional NvMOT_RemoveStreams not implemented
gstnvtracker: Batch processing is OFF
0:00:02.733800677 11322      0x4ece290 INFO                 nvinfer gstnvinfer.cpp:572:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:useEngineFile(): Loading Model Engine from File
0:00:08.122772674 11322      0x4ece290 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): INVALID_ARGUMENT: Can not find binding of given name
0:00:08.122904609 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:568:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:checkEngineParams(): Could not find output layer 'conv2d_bbox' in engine
0:00:08.122999616 11322      0x4ece290 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): INVALID_ARGUMENT: Can not find binding of given name
0:00:08.123058016 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:568:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:checkEngineParams(): Could not find output layer 'conv2d_cov/Sigmoid' in engine
0:00:08.239817305 11322      0x4ece290 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Could not access labels file '/opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/../../models/Primary_Detector_Nano/labels.txt'
0:00:08.271283261 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:850:gst_nvinfer_start:<primary_gie_classifier> error: Failed to create NvDsInferContext instance
0:00:08.271389629 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:850:gst_nvinfer_start:<primary_gie_classifier> error: Config file path: /opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/config_infer_primary_nano.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
** ERROR: <main:651>: Failed to set pipeline to PAUSED
Quitting
ERROR from primary_gie_classifier: Failed to create NvDsInferContext instance
Debug info: gstnvinfer.cpp(850): gst_nvinfer_start (): /GstPipeline:pipeline/GstBin:primary_gie_bin/GstNvInfer:primary_gie_classifier:
Config file path: /opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/config_infer_primary_nano.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED
App run failed

Could you please tell me what’s causing this?

When using TensorRT 5.0, I was able to convert without any problems, and deepstream-app worked.

Using the TensorRT 5 version of tlt-converter ( https://developer.nvidia.com/tlt-converter ):

./tlt-converter /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/resnet10_detector.etlt \
               -k KEY \
               -c /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/calibration.bin \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -d 3,384,1248 \
               -i nchw \
               -m 64 \
               -t int8 \
               -e /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/resnet10_detector.trt \
               -b 4
...
[INFO] Adding reformat layer: conv1/convolution + activation_1/Relu reformatted input 0 (input_1) from Float(1,1248,479232,1437696) to Half(1,1248,479232:2,958464)
[INFO] Adding reformat layer: output_bbox/convolution output to be reformatted 0 (output_bbox/BiasAdd) from Float(1,78,1872,22464) to Half(1,78,1872:2,11232)
[INFO] Adding reformat layer: output_cov/Sigmoid reformatted input 0 (output_cov/BiasAdd) from Half(1,78,1872:2,3744) to Float(1,78,1872,5616)
[INFO] For layer output_cov/Sigmoid a higher-precision implementation was chosen than was requested because it resulted in faster network performance
[INFO] Formats and tactics selection completed in 227.748 seconds.
[INFO] After reformat layers: 19 layers
[INFO] Block size 1073741824
[INFO] Block size 613416960
[INFO] Block size 245366784
[INFO] Block size 245366784
[INFO] Total Activation Memory: 2177892352
[INFO] Detected 1 input and 2 output network tensors.
[INFO] Data initialization and engine generation completed in 0.382303 seconds.

You have generated the TRT engine successfully.
Please check whether a TRT engine exists under the folder /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/

If not, please grant access to the folder and then re-run tlt-converter:
$ sudo chown nvidia:nvidia /opt/nvidia/deepstream/deepstream-4.0/samples/

Yes.
resnet10_detector.trt has been generated in /opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/.

However, when I ran deepstream-app with that resnet10_detector.trt, I got the error I described.

0:00:02.733800677 11322      0x4ece290 INFO                 nvinfer gstnvinfer.cpp:572:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:useEngineFile(): Loading Model Engine from File
0:00:08.122772674 11322      0x4ece290 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): INVALID_ARGUMENT: Can not find binding of given name
0:00:08.122904609 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:568:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:checkEngineParams(): Could not find output layer 'conv2d_bbox' in engine
0:00:08.122999616 11322      0x4ece290 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): INVALID_ARGUMENT: Can not find binding of given name
0:00:08.123058016 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:568:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:checkEngineParams(): Could not find output layer 'conv2d_cov/Sigmoid' in engine
0:00:08.239817305 11322      0x4ece290 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:initialize(): Could not access labels file '/opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/../../models/Primary_Detector_Nano/labels.txt'
0:00:08.271283261 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:850:gst_nvinfer_start:<primary_gie_classifier> error: Failed to create NvDsInferContext instance
0:00:08.271389629 11322      0x4ece290 WARN                 nvinfer gstnvinfer.cpp:850:gst_nvinfer_start:<primary_gie_classifier> error: Config file path: /opt/nvidia/deepstream/deepstream-4.0/samples/configs/deepstream-app/config_infer_primary_nano.txt, NvDsInfer Error: NVDSINFER_CONFIG_FAILED

So I don’t think the TRT engine was generated correctly.
What’s causing this error?

Regarding Could not access labels file '…/…/models/Primary_Detector_Nano/labels.txt': please check whether that file actually exists and is readable.
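For reference, the labels file path comes from the nvinfer config file that deepstream-app loads (here config_infer_primary_nano.txt). A minimal sketch of the relevant properties — the exact values below are illustrative, taken from the paths in your logs:

```ini
[property]
# The labels file the error message refers to; must exist relative to this config
labelfile-path=../../models/Primary_Detector_Nano/labels.txt
# The engine generated by tlt-converter (the -e argument)
model-engine-file=/opt/nvidia/deepstream/deepstream-4.0/samples/experiment_dir_final/resnet10_detector.trt
```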

Thank you.
I had misspelled the labels.txt path.
I fixed that, and deepstream-app worked.

I still get the following error, is it a problem?

Using winsys: x11 
Creating LL OSD context new
0:00:01.939098304 16205     0x176d8c70 INFO                 nvinfer gstnvinfer.cpp:572:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:useEngineFile(): Loading Model Engine from File
0:00:07.460905111 16205     0x176d8c70 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): INVALID_ARGUMENT: Can not find binding of given name
0:00:07.461036567 16205     0x176d8c70 WARN                 nvinfer gstnvinfer.cpp:568:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:checkEngineParams(): Could not find output layer 'conv2d_bbox' in engine
0:00:07.461146487 16205     0x176d8c70 ERROR                nvinfer gstnvinfer.cpp:564:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:log(): INVALID_ARGUMENT: Can not find binding of given name
0:00:07.461198102 16205     0x176d8c70 WARN                 nvinfer gstnvinfer.cpp:568:gst_nvinfer_logger:<primary_gie_classifier> NvDsInferContext[UID 1]:checkEngineParams(): Could not find output layer 'conv2d_cov/Sigmoid' in engine

I am afraid you are using an old version of the TLT docker image, as noted above.
I suggest you train the model with TLT 2.0_dp or TLT 2.0_py3 instead.

To narrow this down, you can also try downloading an existing model and running it on your device with TRT6 installed.
Method: How to run tlt-converter
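One more thing worth checking — this is a guess based on your logs, not something I can confirm from here: the remaining "Could not find output layer 'conv2d_bbox' / 'conv2d_cov/Sigmoid'" warnings suggest that output-blob-names in your nvinfer config still lists the sample model's layer names, while your engine was built with -o output_cov/Sigmoid,output_bbox/BiasAdd. A sketch of the change in config_infer_primary_nano.txt:

```ini
[property]
# Must match the layer names passed to tlt-converter via -o
# (the sample config ships with conv2d_bbox;conv2d_cov/Sigmoid)
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
```

With matching names, nvinfer should find both bindings in the engine and the warnings should disappear.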

Thank you.
I’ll try it.