Errors while running inference on PeopleNet model (no training)

I am trying to download the pretrained PeopleNet model and run inference on it using the TAO Toolkit. However, I am running into errors with both approaches I have tried.

• Hardware - RTX 3050
• Network type - PeopleNet pretrained model
• TAO Toolkit version - 3.21.11

Approach 1

I download the model file in .etlt format and try to use tao converter to generate a TensorRT engine for inference.

tao converter $LOCAL_EXPERIMENT_DIR/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.etlt \
                   -k $KEY \
                   -o output_cov/Sigmoid,output_bbox/BiasAdd \
                   -d 3,544,960 \
                   -i nchw \
                   -m 64 \
                   -e $USER_EXPERIMENT_DIR/experiment_dir_final/peoplenet_detector.trt

This fails with the following error message -

2021-12-06 09:11:47,394 [INFO] root: Registry: ['nvcr.io']
2021-12-06 09:11:47,430 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
2021-12-06 09:11:47,440 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/sitevisit/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[INFO] [MemUsageChange] Init CUDA: CPU +535, GPU +0, now: CPU 1883, GPU 815 (MiB)
[ERROR] UffParser: Unsupported number of graph 0
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] 4: [network.cpp::validate::2411] Error Code 4: Internal Error (Network must have at least one output)
[ERROR] Unable to create engine
2021-12-06 09:11:48,593 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

I have checked the encoding key multiple times; it is set to the default, tlt_encode. I have not done any training steps, only set up the environment variables and downloaded the model, so there is no way the key could have changed either.
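
Separately, since this is the quantized model, I believe an INT8 engine also needs the calibration cache that ships alongside the .etlt file. Based on my reading of the tao-converter usage text (so the -t and -c flags below are my assumption, not something I have verified end to end), the full command would look like this:

# NOTE: -t (engine data type) and -c (calibration cache) are taken from the
# tao-converter help text; resnet34_peoplenet_int8.txt is the cache file that
# is downloaded together with the .etlt model.
tao converter $LOCAL_EXPERIMENT_DIR/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.etlt \
                   -k $KEY \
                   -o output_cov/Sigmoid,output_bbox/BiasAdd \
                   -d 3,544,960 \
                   -i nchw \
                   -m 64 \
                   -t int8 \
                   -c $LOCAL_EXPERIMENT_DIR/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.txt \
                   -e $USER_EXPERIMENT_DIR/experiment_dir_final/peoplenet_detector.trt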

Approach 2

I also try to run inference directly on the .etlt model file using the tao detectnet_v2 inference command. Here is the command I am using -

tao detectnet_v2 inference -e $SPECS_DIR/detectnet_v2_inference_kitti_etlt.txt \
                            -o $USER_EXPERIMENT_DIR/tlt_infer_testing \
                            -i $DATA_DOWNLOAD_DIR/testing/image_2 \
                            -k $KEY

I have attached detectnet_v2_inference_kitti_etlt.txt to this post.

The error I am encountering is as follows -

2021-12-06 11:13:39,281 [INFO] root: Registry: ['nvcr.io']
2021-12-06 11:13:39,317 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3
2021-12-06 11:13:39,330 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/sitevisit/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2021-12-06 05:43:43,538 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_inference_kitti_etlt.txt
2021-12-06 05:43:43,540 [INFO] __main__: Overlain images will be saved in the output path.
2021-12-06 05:43:43,540 [INFO] iva.detectnet_v2.inferencer.build_inferencer: Constructing inferencer
Traceback (most recent call last):
  File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 210, in <module>
  File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 206, in main
  File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 103, in inference_wrapper_batch
  File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/inferencer/build_inferencer.py", line 112, in build_inferencer
  File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/inferencer/trt_inferencer.py", line 157, in __init__
AssertionError: Either a valid tensorfile must be present or a cache file.
2021-12-06 11:13:44,253 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

I do not understand where I am going wrong, since I am only downloading the pretrained model from NVIDIA and following the exact steps from the sample notebook.

Please help me fix either of these approaches so inference runs properly, or suggest what else I can try.

Thanks
detectnet_v2_inference_kitti_etlt.txt (2.6 KB)

For approach 1, can you double-check that the .etlt file is actually available inside the container?
tao detectnet_v2 run ls $LOCAL_EXPERIMENT_DIR/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.etlt
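
If the file is missing, you can download it again with the NGC CLI. A minimal sketch, assuming the version string matches your local directory name (please verify it on the NGC model page):

# Re-download the deployable quantized PeopleNet model.
# The version string is inferred from the local folder name; verify on ngc.nvidia.com.
ngc registry model download-version "nvidia/tao/peoplenet:deployable_quantized_v2.5" \
    --dest $LOCAL_EXPERIMENT_DIR/model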

For approach 2, can you refer to the inference spec files in the TAO Toolkit Quick Start Guide (TAO Toolkit 3.22.05 documentation)?

cv_samples_v1.3.0/detectnet_v2/specs/detectnet_v2_inference_kitti_etlt_qat.txt
cv_samples_v1.3.0/detectnet_v2/specs/detectnet_v2_inference_kitti_etlt.txt
cv_samples_v1.3.0/detectnet_v2/specs/detectnet_v2_inference_kitti_tlt.txt
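
For reference, the tensorrt_config section of the .etlt inference spec in those samples looks roughly like the sketch below. The error you hit ("Either a valid tensorfile must be present or a cache file.") indicates the INT8 inferencer could not find a calibration cache, so make sure calibration_cache points at the .txt file that ships with the quantized PeopleNet model. The paths and exact field values here are assumptions based on your directory layout, so check them against the actual sample file:

inferencer_config{
  # Class names must be listed in the order the network was trained with.
  target_classes: "person"
  target_classes: "bag"
  target_classes: "face"
  # PeopleNet input resolution (matches -d 3,544,960 above).
  image_width: 960
  image_height: 544
  image_channels: 3
  batch_size: 16
  gpu_index: 0
  tensorrt_config{
    parser: ETLT
    # Paths below are assumptions based on the mounts in this thread.
    etlt_model: "/workspace/tao-experiments/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.etlt"
    backend_data_type: INT8
    save_engine: true
    trt_engine: "/workspace/tao-experiments/experiment_dir_final/peoplenet_detector.trt"
    calibrator_config{
      # The INT8 calibration cache that ships with the quantized model; without
      # it the TRT inferencer raises the "tensorfile ... or a cache file" assertion.
      calibration_cache: "/workspace/tao-experiments/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.txt"
      n_batches: 10
      batch_size: 16
    }
  }
}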
