Trouble using tlt-infer on PeopleNet pretrained model

I have been trying to run inference on the PeopleNet pretrained .etlt models but keep running into errors.

root@139aee1557cf:/workspace/projects/ava/granada/peopleNet/tlt_peoplenet_pruned_v2.0# ll
total 45220
drwxr-xr-x 3 1001 1001 4096 Nov 11 22:19 ./
drwxrwxr-x 7 1001 1001 4096 Nov 11 20:43 ../
-rw-r--r-- 1 1001 1001 1887 Nov 11 22:19 config_infer_primary_peoplenet.txt
-rw-r--r-- 1 1001 1001 3398 Nov 10 15:49 deepstream_app_source1_peoplenet.txt
-rw-rw-r-- 1 1001 1001 4557 Nov 11 18:08 inference.py
-rw-rw-r-- 1 1001 1001 499 Nov 9 16:31 inference_working.py
-rw-r--r-- 1 1001 1001 16 Aug 3 20:25 labels.txt
-rw-r--r-- 1 root root 16 Nov 11 22:19 labels_peoplenet.txt
-rw-r--r-- 1 root root 23658347 Nov 11 22:04 peoplenet.engine
-rw-r--r-- 1 1001 1001 7784 Nov 11 21:21 peoplenet.py
-rw-r--r-- 1 1001 1001 5297 Nov 9 22:57 peoplenet.txt
-rw-r--r-- 1 1001 1001 1404 Aug 3 20:25 resnet18_peoplenet_int8.txt
-rw-r--r-- 1 1001 1001 1404 Aug 3 20:25 resnet18_peoplenet_int8_dla.txt
-rw-r--r-- 1 1001 1001 11420239 Aug 3 20:25 resnet18_peoplenet_pruned.etlt
-rw-r--r-- 1 1001 1001 2747 Aug 3 20:25 resnet34_peoplenet_int8.txt
-rw-r--r-- 1 1001 1001 9376 Aug 3 20:25 resnet34_peoplenet_int8_dla.txt
-rw-r--r-- 1 1001 1001 10474079 Aug 3 20:25 resnet34_peoplenet_pruned.etlt
drwxr-xr-x 6 1001 1001 4096 Nov 6 20:00 samples/
-rw-r--r-- 1 1001 1001 0 Nov 11 22:05 test.png
-rw-r--r-- 1 1001 1001 662412 Nov 6 19:40 zoom.png
root@139aee1557cf:/workspace/projects/ava/granada/peopleNet/tlt_peoplenet_pruned_v2.0# tlt-infer detectnet_v2 -e config_infer_primary_peoplenet.txt -i zoom.png -o test.png -k tlt_encode
Using TensorFlow backend.
2020-11-11 22:21:19.919683: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Traceback (most recent call last):
File "/usr/local/bin/tlt-infer", line 8, in <module>
sys.exit(main())
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_infer.py", line 54, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 187, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/spec_handler/spec_loader.py", line 88, in load_inference_spec
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/spec_handler/spec_loader.py", line 50, in load_proto
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/spec_handler/spec_loader.py", line 36, in _load_from_file
File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 734, in Merge
allow_unknown_field=allow_unknown_field)
File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 802, in MergeLines
return parser.MergeLines(lines, message)
File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 827, in MergeLines
self._ParseOrMerge(lines, message)
File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 849, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "/usr/local/lib/python3.6/dist-packages/google/protobuf/text_format.py", line 895, in _MergeField
message_descriptor.full_name)
google.protobuf.text_format.ParseError: 23:2 : Message type "Inference" does not have extensions.

Please see the attached config file, config_infer_primary_peoplenet.txt (1.8 KB).

The spec file you are passing to tlt-infer is not correct.
Please refer to the Jupyter notebook:

# Running inference for detection on n images
!tlt-infer detectnet_v2 -e $SPECS_DIR/detectnet_v2_inference_kitti_tlt.txt \
-o $USER_EXPERIMENT_DIR/tlt_infer_testing \
-i $DATA_DOWNLOAD_DIR/testing/image_2 \
-k $KEY

You can find detectnet_v2_inference_kitti_tlt.txt inside the Docker container.

@Morganh Thanks for your input. I am now using the attached file detectnet_v2_inference_kitti_etlt.txt (2.1 KB), and I am getting an array shape mismatch (broadcast) error instead.

root@139aee1557cf:/workspace/projects/peopleNet/tlt_peoplenet_pruned_v2.0# tlt-infer detectnet_v2 -e detectnet_v2_inference_kitti_etlt.txt -i zoom.png -o test.png -k tlt_encode
Using TensorFlow backend.
2020-11-12 13:53:58.101802: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-12 13:54:00,945 [INFO] iva.detectnet_v2.scripts.inference: Overlain images will be saved in the output path.
2020-11-12 13:54:00,945 [INFO] iva.detectnet_v2.inferencer.build_inferencer: Constructing inferencer
2020-11-12 13:54:01,681 [INFO] iva.detectnet_v2.inferencer.trt_inferencer: Reading from engine file at: peoplenet.engine
2020-11-12 13:54:02,807 [INFO] iva.detectnet_v2.scripts.inference: Initialized model
2020-11-12 13:54:02,807 [INFO] iva.detectnet_v2.scripts.inference: Commencing inference
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/bin/tlt-infer", line 8, in <module>
sys.exit(main())
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_infer.py", line 54, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 194, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 150, in inference_wrapper_batch
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/inferencer/trt_inferencer.py", line 443, in infer_batch
ValueError: could not broadcast input array from shape (4,544,960) into shape (3,544,960)

I generated the peoplenet.engine file with

tlt-converter -k tlt_encode -o output_bbox/BiasAdd,output_cov/Sigmoid -d 3,554,960 -e peoplenet.engine -t fp32 -w 6144 resnet34_peoplenet_pruned.etlt

Note that I keep getting the following message regardless of the workspace size. Please let me know if the usage is incorrect.

[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.

Please modify your detectnet_v2_inference_kitti_etlt.txt.

The target classes below are not correct. The model was not trained on these classes, right?

target_classes: "car"
target_classes: "cyclist"
target_classes: "pedestrian"

Thanks for pointing it out. I updated the target classes and am now running into a different issue; please see the error below. Is any preprocessing needed on the image files?

tlt_peoplenet_pruned_v2.0# tlt-infer detectnet_v2 -e detectnet_v2_inference_kitti_etlt.txt -i zoom.png -o test.png -k tlt_encode
Using TensorFlow backend.
2020-11-13 16:03:46.236092: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-13 16:03:48,297 [INFO] iva.detectnet_v2.scripts.inference: Overlain images will be saved in the output path.
2020-11-13 16:03:48,297 [INFO] iva.detectnet_v2.inferencer.build_inferencer: Constructing inferencer
2020-11-13 16:03:48,552 [INFO] iva.detectnet_v2.inferencer.trt_inferencer: Reading from engine file at: peoplenet.engine
2020-11-13 16:03:49,390 [INFO] iva.detectnet_v2.scripts.inference: Initialized model
2020-11-13 16:03:49,390 [INFO] iva.detectnet_v2.scripts.inference: Commencing inference
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/usr/local/bin/tlt-infer", line 8, in <module>
sys.exit(main())
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_infer.py", line 54, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 194, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 150, in inference_wrapper_batch
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/inferencer/trt_inferencer.py", line 443, in infer_batch
ValueError: could not broadcast input array from shape (4,544,960) into shape (3,544,960)
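
On the preprocessing question: the leading 4 in the error shape (4,544,960) suggests the input image may carry a fourth channel, for example an alpha channel in a PNG, while the engine buffer expects 3. That is only an assumption and is not confirmed anywhere in this thread. A minimal sketch for checking how many channels a PNG stores, using only the standard library and the IHDR layout from the PNG specification (the helper names are hypothetical):

```python
import struct

# Channels stored per pixel for each PNG color type (per the PNG specification).
# Palette images (type 3) store one index per pixel; decoders usually expand them to RGB.
PNG_CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_channels(header: bytes) -> int:
    """Return the stored channel count given the first 26 bytes of a PNG file."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file or header too short")
    # IHDR data: width (4 bytes), height (4 bytes), bit depth (1), color type (1).
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return PNG_CHANNELS[color_type]


def png_channels_from_file(path: str) -> int:
    """Hypothetical convenience wrapper: read just enough of the file to inspect IHDR."""
    with open(path, "rb") as f:
        return png_channels(f.read(26))


# In-memory demo: a 960x544, 8-bit RGBA header (color type 6) reports 4 channels.
demo = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">IIBB", 960, 544, 8, 6)
print(png_channels(demo))
```

If `png_channels_from_file("zoom.png")` were to report 4, converting the image to 3-channel RGB before running tlt-infer (for instance with Pillow's `Image.convert("RGB")`) would be one thing to try.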

Please paste your latest detectnet_v2_inference_kitti_etlt.txt.

I also found a mismatch in your previous config file.
Please change

key: "Person"
value: {
confidence_model: "aggregate_cov"
output_map: "car"

to

key: "Person"
value: {
confidence_model: "aggregate_cov"
output_map: "Person"

The other two classes need the same change.
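
To spot this kind of mismatch across all classes at once, here is a rough sketch that scans the spec text for `key` / `output_map` pairs and reports those that differ. It assumes, as in the fix above, that each class key should map to an output of the same name, and it uses a simple regex rather than a real protobuf parser; the function name is hypothetical.

```python
import re

# Matches a `key: "X"` entry and the next `output_map: "Y"` that follows it.
# Caveat: if a class block has no output_map, the key pairs with the next block's value.
PAIR_RE = re.compile(r'key\s*:\s*"([^"]+)".*?output_map\s*:\s*"([^"]+)"', re.DOTALL)


def mismatched_output_maps(spec_text: str):
    """Return (key, output_map) pairs whose two names differ."""
    return [(key, out) for key, out in PAIR_RE.findall(spec_text) if key != out]


SAMPLE = '''
key: "Person"
value: {
confidence_model: "aggregate_cov"
output_map: "car"
}
'''

print(mismatched_output_maps(SAMPLE))
```

An empty list means every key found by the regex already maps to an output of the same name.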

Thanks for your responses. I corrected the output_map entries, but the input array broadcast error still persists. Please find the updated detectnet_v2_inference_kitti_etlt.txt (1.7 KB) attached.

I cannot reproduce your issue.

My steps:

  1. $ wget https://api.ngc.nvidia.com/v2/models/nvidia/tlt_peoplenet/versions/pruned_v2.0/files/resnet18_peoplenet_pruned.etlt
  2. $ tlt-converter resnet18_peoplenet_pruned.etlt -k tlt_encode -o output_cov/Sigmoid,output_bbox/BiasAdd -d 3,544,960 -i nchw -e peoplenet_fp32.engine -m 64 -t fp32 -b 64
  3. $ tlt-infer detectnet_v2 -e forum_topic_159352_etlt.txt -o result_infer_its -k tlt_encode -i 1.jpg

Note that forum_topic_159352_etlt.txt is identical to the attachment in your last comment.

I also noticed that you generated the TensorRT engine with -d 3,554,960 in your tlt-converter command. It should be 3,544,960 instead. Please check whether that is the culprit.
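
A typo like this is easy to miss by eye. As a small sketch, the value passed to `-d` can be compared axis by axis against the expected input shape. The expected 3,544,960 is the one quoted in this thread; the helper names are hypothetical and not part of any TLT tool.

```python
EXPECTED_DIMS = (3, 544, 960)  # channels, height, width as quoted in this thread


def parse_dims(arg: str):
    """Parse a 'C,H,W' string such as the value passed to tlt-converter -d."""
    return tuple(int(part) for part in arg.split(","))


def dims_mismatch(arg: str, expected=EXPECTED_DIMS):
    """Return (axis_name, got, expected) for every axis that differs."""
    names = ("channels", "height", "width")
    return [(n, g, e) for n, g, e in zip(names, parse_dims(arg), expected) if g != e]


# The value from the original tlt-converter command differs only in height.
print(dims_mismatch("3,554,960"))  # [('height', 554, 544)]
```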