Failed to decode TrafficCamNet from etlt to ONNX

Please provide the following information when requesting support.

• Hardware - GeForce RTX 3050
• Network Type - TrafficCamNet
• TLT Version - nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
• Training spec file (if you have one, please share it here)
• How to reproduce the issue? (This is for errors. Please share the command line and the detailed log here.)

I followed the advice shared in Fpenet retraining output file onnx but deepstream is using tlt - #12 by Morganh for decoding an etlt file to onnx.

Steps to reproduce:

  1. Downloaded the etlt file using:
wget --content-disposition 'https://api.ngc.nvidia.com/v2/models/org/nvidia/team/tao/trafficcamnet/pruned_v1.0.3/files?redirect=true&path=resnet18_trafficcamnet_pruned.etlt' -O resnet18_trafficcamnet_pruned.etlt
  2. Started the container and mounted the folder containing the file, along with the following script:
import argparse
import struct

# Decryption helper shipped inside the TAO TF1 container.
from nvidia_tao_tf1.encoding import encoding

def parse_command_line(args):
    '''Parse command line arguments.'''
    parser = argparse.ArgumentParser(description='ETLT Decode Tool')
    parser.add_argument('-m',
                        '--model',
                        type=str,
                        required=True,
                        help='Path to the etlt file.')
    parser.add_argument('-o',
                        '--uff',
                        required=True,
                        type=str,
                        help='The path to the uff file.')
    parser.add_argument('-k',
                        '--key',
                        required=True,
                        type=str,
                        help='encryption key.')
    return parser.parse_args(args)


def decode(tmp_etlt_model, tmp_uff_model, key):
    with open(tmp_uff_model, 'wb') as temp_file, open(tmp_etlt_model, 'rb') as encoded_file:
        # The etlt header is a 4-byte little-endian length followed by the
        # input node name; skip past both before decrypting the payload.
        size = encoded_file.read(4)
        size = struct.unpack("<i", size)[0]
        input_node_name = encoded_file.read(size)  # read only to advance past the header; unused
        encoding.decode(encoded_file, temp_file, key.encode())


def main(args=None):
    args = parse_command_line(args)
    decode(args.model, args.uff, args.key)
    print("Decode successfully.")


if __name__ == "__main__":
    main()
  3. Decoded the etlt file using the command:
python decode_etlt.py -m trafficcamnet/resnet18_trafficcamnet_pruned.etlt -o trafficcamnet/trafficcamnet.onnx -k tlt_encode

which printed
Decode successfully.

  4. Started a Python console with onnxruntime installed and ran the commands:
import onnxruntime as ort
trafficcamnet_path = "trafficcamnet/trafficcamnet.onnx"
session = ort.InferenceSession(trafficcamnet_path)

which resulted in the following error

---------------------------------------------------------------------------
Fail                                      Traceback (most recent call last)
Cell In[21], line 2
      1 trafficcamnet_path = "trafficcamnet/trafficcamnet.onnx"
----> 2 session = ort.InferenceSession(trafficcamnet_path)

File ~/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:419, in InferenceSession.__init__(self, path_or_bytes, sess_options, providers, provider_options, **kwargs)
    416 disabled_optimizers = kwargs["disabled_optimizers"] if "disabled_optimizers" in kwargs else None
    418 try:
--> 419     self._create_inference_session(providers, provider_options, disabled_optimizers)
    420 except (ValueError, RuntimeError) as e:
    421     if self._enable_fallback:

File ~/.local/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:452, in InferenceSession._create_inference_session(self, providers, provider_options, disabled_optimizers)
    450 session_options = self._sess_options if self._sess_options else C.get_default_session_options()
    451 if self._model_path:
--> 452     sess = C.InferenceSession(session_options, self._model_path, True, self._read_config_from_model)
    453 else:
    454     sess = C.InferenceSession(session_options, self._model_bytes, False, self._read_config_from_model)

Fail: [ONNXRuntimeError] : 1 : FAIL : Load model from trafficcamnet/trafficcamnet.onnx failed:/onnxruntime_src/onnxruntime/core/graph/model.cc:134 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) ModelProto does not have a graph.

Is there something missing? Is there a limitation on decoding pruned models?
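
For reference, a quick way to see what onnxruntime is complaining about, assuming the onnx Python package is installed (a minimal sketch, not part of the original steps):

import onnx

# The decoded file parses as an ONNX ModelProto envelope, but -- exactly as
# the runtime error says -- it contains no graph, because the decoded
# payload is not ONNX (it turns out to be UFF, per the reply below).
model = onnx.load("trafficcamnet/trafficcamnet.onnx")
print(len(model.graph.node))  # 0: the ModelProto has no graph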

Please save to a .uff file instead of a .onnx file. Then you can use Netron to open this UFF file.
TrafficCamNet on the NGC model card is based on the DetectNet_v2 network; previously it was distributed as a UFF file.

In TAO Toolkit versions 5.0.0 and later, the tao model detectnet_v2 export command serializes the exported model directly to an unencrypted .onnx file. You can use this command to export.

Thanks for the quick reply. I had a look at Netron, and it seems suitable only for inspecting the network. What I want to do is use the model for inference.

Is there a way to use UFF models for inference using Python?
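
For context, UFF models were historically consumed through TensorRT's UFF parser rather than onnxruntime. A minimal sketch, assuming an older TensorRT release that still ships the (since removed) UFF parser; the node names and input shape are assumptions taken from the TrafficCamNet model card, not from this thread:

import tensorrt as trt

# Build a TensorRT network from a UFF file (deprecated path; the UFF
# parser was removed in recent TensorRT releases).
logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network()  # implicit-batch network, as UFF requires
parser = trt.UffParser()
parser.register_input("input_1", (3, 544, 960))   # assumed input name/shape (CHW)
parser.register_output("output_cov/Sigmoid")      # assumed coverage output
parser.register_output("output_bbox/BiasAdd")     # assumed bbox output
if not parser.parse("trafficcamnet.uff", network):
    raise RuntimeError("Failed to parse the UFF file")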

You can use TAO 5.0 to export the .tlt file from TrafficCamNet | NVIDIA NGC to an .onnx file.

I have a few more questions:

  1. Where is it documented whether a model is originally ONNX, UFF, or another format?

  2. How can I use TAO 5.0 to export the tlt file? I wasn’t able to find any documentation on how to do it.

I’ve found this post, TLT to onnx at TAO5.0, but was not able to find the decode_eff.py file inside the container. I looked in both nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf2.11.0 and nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5.

  3. Is there any limitation on converting pruned models to ONNX?

You can refer to Integrating TAO Models into DeepStream - NVIDIA Docs.

You can refer to DetectNet_v2 - NVIDIA Docs or the notebook.

There is no limitation for this.

I’ve followed the instructions described in DetectNet_v2 - NVIDIA Docs, but I’m getting errors trying to convert both TrafficCamNet | NVIDIA NGC and TAO Pretrained DetectNet V2 | NVIDIA NGC.

Strangely, I get a different error for each. For TrafficCamNet,

tao model detectnet_v2 export -m ./trafficcamnet/resnet18_trafficcamnet.tlt -o ./trafficcamnet/resnet18_trafficcamnet.onnx

I get the error

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tao_mounts.json.
Please note that this will be deprecated going forward.
2023-12-06 13:19:08,449 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-12-06 13:19:08,511 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 361: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-12-06 13:19:08,550 [TAO Toolkit] [INFO] root 99: No mount points were found in the /home/omri/.tao_mounts.json file.
2023-12-06 13:19:08,550 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 267: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/omri/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-12-06 13:19:08,550 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
2023-12-06 03:19:09.478620: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-12-06 03:19:09,507 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2023-12-06 03:19:10,272 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:19:10,291 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:19:10,293 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:19:11,195 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:19:12,146 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:19:12,164 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:19:12,165 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:19:12,461 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.app 264: Saving exported model to ./trafficcamnet/resnet18_trafficcamnet.onnx
2023-12-06 03:19:12,461 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.keras_exporter 119: Setting the onnx export route to keras2onnx
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/export.py", line 42, in <module>
    raise e
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/export.py", line 26, in <module>
    launch_export(Exporter, backend="onnx")
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/export/app.py", line 323, in launch_export
    run_export(Exporter, args, backend)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/export/app.py", line 286, in run_export
    exporter.set_keras_backend_dtype()
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/export/keras_exporter.py", line 132, in set_keras_backend_dtype
    tmp_keras_file_name = get_decoded_filename(self.model_path,
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/utils.py", line 400, in get_decoded_filename
    raise ValueError("Cannot find input file name.")
ValueError: Cannot find input file name.
Execution status: FAIL
2023-12-06 13:19:22,017 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 337: Stopping container.

and for

tao model detectnet_v2 export -m resnet18.hdf5 -o resnet18.onnx

The error is

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tao_mounts.json.
Please note that this will be deprecated going forward.
2023-12-06 13:18:29,642 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2023-12-06 13:18:29,701 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 361: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2023-12-06 13:18:29,735 [TAO Toolkit] [INFO] root 99: No mount points were found in the /home/omri/.tao_mounts.json file.
2023-12-06 13:18:29,735 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 267: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/omri/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
2023-12-06 13:18:29,735 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 275: Printing tty value True
2023-12-06 03:18:30.668258: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-12-06 03:18:30,697 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2023-12-06 03:18:31,467 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:18:31,486 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:18:31,488 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:18:32,392 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:18:33,358 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:18:33,375 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:18:33,377 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 03:18:33,680 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.app 264: Saving exported model to resnet18.onnx
2023-12-06 03:18:33,681 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.keras_exporter 119: Setting the onnx export route to keras2onnx
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/export.py", line 42, in <module>
    raise e
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/detectnet_v2/scripts/export.py", line 26, in <module>
    launch_export(Exporter, backend="onnx")
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/export/app.py", line 323, in launch_export
    run_export(Exporter, args, backend)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/export/app.py", line 286, in run_export
    exporter.set_keras_backend_dtype()
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/cv/common/export/keras_exporter.py", line 134, in set_keras_backend_dtype
    model_input_dtype = get_model_input_dtype(tmp_keras_file_name)
  File "/usr/local/lib/python3.8/dist-packages/nvidia_tao_tf1/core/export/app.py", line 68, in get_model_input_dtype
    with h5py.File(keras_hdf5_file, mode="r") as f:
  File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/files.py", line 312, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
  File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/files.py", line 142, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 78, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = 'resnet18.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Execution status: FAIL
2023-12-06 13:18:38,332 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 337: Stopping container.

I’ve checked that both files exist at the paths given in the commands.
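
For anyone hitting the same error: the tao launcher runs the job inside a container, so host paths are only visible if they are mapped through ~/.tao_mounts.json. A minimal sketch that writes such a file (both paths are placeholders, not from this thread):

import json
from pathlib import Path

# Map a host directory into the container so `tao model detectnet_v2 export`
# can see the model file.
mounts = {
    "Mounts": [
        {
            "source": "/home/omri/models",       # host directory (placeholder)
            "destination": "/workspace/models"   # path seen inside the container (placeholder)
        }
    ]
}
Path.home().joinpath(".tao_mounts.json").write_text(json.dumps(mounts, indent=4))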

@Morganh I was able to make some progress by running the following command

docker run -it --rm --gpus all -v ./:/home/tao/ nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5 detectnet_v2 export -m /home/tao/resnet18.hdf5 -o /home/tao/resnet18.onnx

which successfully converts the hdf5 file to a valid ONNX model. However, when I try to convert the TrafficCamNet model using a similar command

docker run -it --rm --gpus all -v ./:/home/tao/ nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5 detectnet_v2 export -m /home/tao/resnet18_trafficcamnet.tlt -o /home/tao/resnet18_trafficcamnet.onnx

It doesn’t produce an output file, and the logs don’t indicate any errors:

=======================
=== TAO Toolkit TF1 ===
=======================

NVIDIA Release 5.0.0-TF1 (build 52693369)
TAO Toolkit Version 5.0.0

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/tao-toolkit-software-license-agreement

NOTE: The SHMEM allocation limit is set to the default of 64MB.  This may be
   insufficient for TAO Toolkit.  NVIDIA recommends the use of the following flags:
   docker run --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 ...

2023-12-06 04:17:52.409472: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2023-12-06 04:17:52,437 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2023-12-06 04:17:53,192 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 04:17:53,211 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 04:17:53,213 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 04:17:54,100 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 04:17:55,049 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable  TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 04:17:55,066 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable  TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 04:17:55,068 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable  TF_ALLOW_IOLIBS=1.
2023-12-06 04:17:55,366 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.app 264: Saving exported model to /home/tao/resnet18_trafficcamnet.onnx
2023-12-06 04:17:55,366 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.keras_exporter 119: Setting the onnx export route to keras2onnx
Execution status: PASS

Is it correct to try to convert TrafficCamNet using detectnet_v2 export ...? Is there anything else missing for it to work?

I found the missing ingredient: it was the encryption key. The command

docker run -it --rm --gpus all -v ./:/home/tao/ nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5 detectnet_v2 export -m /home/tao/resnet18_trafficcamnet.tlt -o /home/tao/resnet18_trafficcamnet.onnx -k tlt_encode

works.
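
As a quick sanity check that the exported file is now a valid ONNX model (assuming onnxruntime is installed on the host; a sketch, not from the original thread):

import onnxruntime as ort

# Load the exported model and print its input/output signatures.
session = ort.InferenceSession("resnet18_trafficcamnet.onnx")
for tensor in session.get_inputs() + session.get_outputs():
    print(tensor.name, tensor.shape, tensor.type)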

One thing that would be great to add is a warning/error message for cases where the key is missing.
