I am trying to download the pretrained PeopleNet model and use it to run inference with the TAO Toolkit, but I am running into errors with both approaches I have tried.
• Hardware - RTX 3050
• Network type - PeopleNet pretrained model
• TAO Toolkit version - 3.21.11
Approach 1
I downloaded the model file in .etlt format and tried to use tao converter to generate a TensorRT engine for inference.
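For reference, I pulled the model with the NGC CLI, roughly like this (the version string is from memory and may not be exact):
ngc registry model download-version nvidia/tao/peoplenet:deployable_quantized_v2.5 \
 --dest $LOCAL_EXPERIMENT_DIR/model
I then ran the following conversion command: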
tao converter $LOCAL_EXPERIMENT_DIR/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.etlt \
-k $KEY \
-o output_cov/Sigmoid,output_bbox/BiasAdd \
-d 3,544,960 \
-i nchw \
-m 64 \
-e $USER_EXPERIMENT_DIR/experiment_dir_final/peoplenet_detector.trt
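Here $LOCAL_EXPERIMENT_DIR is a directory on my host and $USER_EXPERIMENT_DIR is the corresponding path inside the container; my /home/sitevisit/.tao_mounts.json maps the two roughly like this (the exact paths below are illustrative, not copied verbatim):
{
    "Mounts": [
        {
            "source": "/home/sitevisit/tao-experiments",
            "destination": "/workspace/tao-experiments"
        }
    ]
}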
This fails with the following error message:
2021-12-06 09:11:47,394 [INFO] root: Registry: ['nvcr.io']
2021-12-06 09:11:47,430 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.5-py3
2021-12-06 09:11:47,440 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/sitevisit/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[INFO] [MemUsageChange] Init CUDA: CPU +535, GPU +0, now: CPU 1883, GPU 815 (MiB)
[ERROR] UffParser: Unsupported number of graph 0
[ERROR] Failed to parse the model, please check the encoding key to make sure it's correct
[ERROR] 4: [network.cpp::validate::2411] Error Code 4: Internal Error (Network must have at least one output)
[ERROR] Unable to create engine
2021-12-06 09:11:48,593 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
I have checked the encoding key multiple times; it is set to the default, which is tlt_encode. I have not run any training steps, only set up the environment variables and downloaded the model before this, so there is no way the key could have changed either.
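For completeness, the relevant variables are set roughly as follows (the key is the default; the local paths are illustrative rather than copied verbatim from my notebook):
# environment set up before downloading the model (local paths illustrative)
export KEY=tlt_encode
export LOCAL_EXPERIMENT_DIR=/home/sitevisit/tao-experiments/detectnet_v2
export USER_EXPERIMENT_DIR=/workspace/tao-experiments/detectnet_v2
export SPECS_DIR=/workspace/tao-experiments/detectnet_v2/specs
export DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data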
Approach 2
I also tried to run inference directly on the .etlt model file using the tao detectnet_v2 inference command. Here is the command I am using:
tao detectnet_v2 inference -e $SPECS_DIR/detectnet_v2_inference_kitti_etlt.txt \
-o $USER_EXPERIMENT_DIR/tlt_infer_testing \
-i $DATA_DOWNLOAD_DIR/testing/image_2 \
-k $KEY
I have attached detectnet_v2_inference_kitti_etlt.txt to this post.
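The relevant inferencer_config section of that spec looks roughly like this (reproduced from memory; the attached file is the authoritative version, and the model path shown is illustrative):
inferencer_config {
  # PeopleNet classes and input resolution matching -d 3,544,960 above
  target_classes: "person"
  target_classes: "bag"
  target_classes: "face"
  image_width: 960
  image_height: 544
  image_channels: 3
  batch_size: 16
  tlt_config {
    # illustrative path inside the container
    model: "/workspace/tao-experiments/detectnet_v2/model/peoplenet_vdeployable_quantized_v2.5/resnet34_peoplenet_int8.etlt"
  }
}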
The error I am encountering is as follows:
2021-12-06 11:13:39,281 [INFO] root: Registry: ['nvcr.io']
2021-12-06 11:13:39,317 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.21.11-tf1.15.4-py3
2021-12-06 11:13:39,330 [WARNING] tlt.components.docker_handler.docker_handler:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/sitevisit/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2021-12-06 05:43:43,538 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /workspace/tao-experiments/detectnet_v2/specs/detectnet_v2_inference_kitti_etlt.txt
2021-12-06 05:43:43,540 [INFO] __main__: Overlain images will be saved in the output path.
2021-12-06 05:43:43,540 [INFO] iva.detectnet_v2.inferencer.build_inferencer: Constructing inferencer
Traceback (most recent call last):
File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 210, in <module>
File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 206, in main
File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/inference.py", line 103, in inference_wrapper_batch
File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/inferencer/build_inferencer.py", line 112, in build_inferencer
File "/opt/tlt/.cache/dazel/_dazel_tlt/75913d2aee35770fa76c4a63d877f3aa/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/inferencer/trt_inferencer.py", line 157, in __init__
AssertionError: Either a valid tensorfile must be present or a cache file.
2021-12-06 11:13:44,253 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
I am not sure where I am going wrong, since I am just downloading the pretrained model from NVIDIA and following the exact steps from the sample notebook.
Please help me fix either of these approaches so that inference runs properly, or suggest what else I can try.
Thanks
detectnet_v2_inference_kitti_etlt.txt (2.6 KB)