How to use a TensorRT engine obtained using tlt-converter

  1. I have created a custom model using DetectNet_v2 and exported it using tlt-export. I am using the latest TLT 2.0.

  2. On the target machine, I have copied tlt-converter into the TensorRT Docker container (nvcr.io/nvidia/tensorrt:20.03-py3).

  3. I run tlt-converter:
    root@e1186e3b5bbc:/root# export API_KEY=
    root@e1186e3b5bbc:/root# export OUTPUT_NODES=output_bbox/BiasAdd,output_cov/Sigmoid
    root@e1186e3b5bbc:/root# export INPUT_DIMS=3,1920,1080
    root@e1186e3b5bbc:/root# export D_TYPE=fp32
    root@e1186e3b5bbc:/root# export ENGINE_PATH=engine_bs_1.buf
    root@e1186e3b5bbc:/root# export MODEL_PATH=resnet18_detector.etlt
    root@e1186e3b5bbc:/root# ll
    total 223900
    drwxr-xr-x 2 root root      4096 May  6 00:09 ./
    drwxrwxr-x 6 1000 1000      4096 May  6 00:02 ../
    -rw-r--r-- 1 root root      4144 Apr 30 06:32 calibration.bin
    -rw-r--r-- 1 root root 132395081 Apr 30 06:31 calibration.tensor
    -rw-r--r-- 1 root root  44874341 Apr 30 06:31 resnet18_detector.etlt
    -rw-r--r-- 1 root root  23863005 Apr 30 06:36 resnet18_detector.trt
    -rw-r--r-- 1 root root  28059357 Apr 30 06:32 resnet18_detector.trt.int8
    -rwxr-xr-x 1 root root     54776 Apr 26 22:00 tlt-converter*
    root@e1186e3b5bbc:/root# chmod +x tlt-converter
    root@e1186e3b5bbc:/root# ./tlt-converter -k $API_KEY -o $OUTPUT_NODES -d $INPUT_DIMS -e $ENGINE_PATH $MODEL_PATH
    [INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
    [INFO] Detected 1 inputs and 2 output network tensors.
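
    To sanity-check what the converter produced, it can help to deserialize the engine and list its bindings before wiring it into another app. A minimal Python sketch, assuming the standard TensorRT Python API shipped in this container, run next to engine_bs_1.buf:

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
    with open("engine_bs_1.buf", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
        for i in range(engine.num_bindings):
            kind = "input" if engine.binding_is_input(i) else "output"
            print(i, kind, engine.get_binding_name(i), engine.get_binding_shape(i))

    For a DetectNet_v2 export with the output nodes above, this should list input_1 plus output_bbox/BiasAdd and output_cov/Sigmoid.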

  4. Now I am trying to use the generated engine in the object detection example https://github.com/NVIDIA/object-detection-tensorrt-example. I have modified models.py so that it skips downloading the SSD model and uses the generated engine instead; a rough sketch of the kind of change is below.
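
    The sketch below is only illustrative (the helper name and signature are hypothetical, not the repo's actual code); the essential change is to hand the demo the path of the prebuilt tlt-converter engine instead of building one from the downloaded SSD model:

    import os

    def prebuilt_engine_path(workspace_dir, precision="FLOAT", batch_size=1):
        # Hypothetical helper: mirror the demo's cache layout,
        # workspace/engines/<precision>/engine_bs_<N>.buf
        return os.path.join(workspace_dir, "engines", precision,
                            "engine_bs_{}.buf".format(batch_size))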

I am facing issues when I run the demo:
root@e1186e3b5bbc:/home/mike# python ./SSD_Model/detect_objects_webcam.py
TensorRT inference engine settings:

  • Inference precision - DataType.FLOAT
  • Max batch size - 1

Loading cached TensorRT engine from /home/mike/SSD_Model/utils/../workspace/engines/FLOAT/engine_bs_1.buf
Traceback (most recent call last):
  File "./SSD_Model/detect_objects_webcam.py", line 194, in <module>
    main()
  File "./SSD_Model/detect_objects_webcam.py", line 158, in main
    batch_size=args.max_batch_size)
  File "/home/mike/SSD_Model/utils/inference.py", line 127, in __init__
    engine_utils.allocate_buffers(self.trt_engine)
  File "/home/mike/SSD_Model/utils/engine.py", line 56, in allocate_buffers
    dtype = binding_to_type[str(binding)]
KeyError: 'input_1'

tlt-converter's -d argument expects the input dimensions in C,H,W order, and a 1920x1080 frame is 1080 high by 1920 wide. Please modify your

export INPUT_DIMS=3,1920,1080

to

export INPUT_DIMS=3,1080,1920

and retry?
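
For illustration, an OpenCV-style webcam frame arrives as H,W,C and has to be transposed to C,H,W before inference (a minimal sketch, numpy only):

import numpy as np

# An OpenCV capture returns frames as (height, width, channels).
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)

# The engine, like tlt-converter's -d flag, expects channel-first C,H,W.
chw = frame.transpose((2, 0, 1))
print(chw.shape)  # (3, 1080, 1920) -> INPUT_DIMS=3,1080,1920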

I tried again after changing the input dims.
export API_KEY=
export OUTPUT_NODES=output_bbox/BiasAdd,output_cov/Sigmoid
export INPUT_DIMS=3,1080,1920
export D_TYPE=fp32
export ENGINE_PATH=engine_bs_1.buf
export MODEL_PATH=resnet18_detector.etlt
./tlt-converter -k $API_KEY -o $OUTPUT_NODES -d $INPUT_DIMS -e $ENGINE_PATH $MODEL_PATH

Still the same error.
root@e1186e3b5bbc:/home/mike# cp -rp SSD_Model/tlt/engine_bs_1.buf SSD_Model/workspace/engines/FLOAT/
root@e1186e3b5bbc:/home/mike# python ./SSD_Model/detect_objects_webcam.py
TensorRT inference engine settings:

  • Inference precision - DataType.FLOAT
  • Max batch size - 1

Loading cached TensorRT engine from /home/mike/SSD_Model/utils/../workspace/engines/FLOAT/engine_bs_1.buf
Traceback (most recent call last):
  File "./SSD_Model/detect_objects_webcam.py", line 194, in <module>
    main()
  File "./SSD_Model/detect_objects_webcam.py", line 158, in main
    batch_size=args.max_batch_size)
  File "/home/mike/SSD_Model/utils/inference.py", line 127, in __init__
    engine_utils.allocate_buffers(self.trt_engine)
  File "/home/mike/SSD_Model/utils/engine.py", line 56, in allocate_buffers
    dtype = binding_to_type[str(binding)]
KeyError: 'input_1'
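
The KeyError is the same regardless of the input dims, which suggests allocate_buffers in engine.py looks each binding's dtype up in a dict keyed by the SSD binding names, while this DetectNet_v2 engine exposes input_1, output_bbox/BiasAdd and output_cov/Sigmoid. A sketch of a binding-name-agnostic version that queries the engine itself (assuming pycuda and the standard TensorRT Python API; the example wraps host/device pointers in a small helper class, while this sketch returns plain tuples to stay short):

import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

def allocate_buffers(engine):
    # Derive each binding's size and dtype from the engine instead of a
    # hardcoded binding-name -> dtype dict, so any binding name works.
    inputs, outputs, bindings = [], [], []
    stream = cuda.Stream()
    for binding in engine:  # iterating an engine yields binding names
        size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))  # no KeyError for 'input_1'
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        bindings.append(int(device_mem))
        if engine.binding_is_input(binding):
            inputs.append((host_mem, device_mem))
        else:
            outputs.append((host_mem, device_mem))
    return inputs, outputs, bindings, stream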

Hi Mike,
By default, the TensorRT engine can be deployed in DeepStream. Could you run it successfully with DeepStream? Refer to https://docs.nvidia.com/metropolis/TLT/tlt-getting-started-guide/index.html#deepstream_deployment.

As for your case of testing the generated engine in the object detection example https://github.com/NVIDIA/object-detection-tensorrt-example, I will dig into the issue further.