TAO Toolkit observations

The .etlt is actually an encrypted ONNX file. Since TAO 5.0 the source code is open and the training result is an .onnx file instead. In the DeepStream environment you can use trtexec to generate the TensorRT engine (see TRTEXEC with LPRNet - NVIDIA Docs), then configure only model-engine-file. Refer to deepstream_lpr_app/deepstream-lpr-app/lpr_config_sgie_us.yml at master · NVIDIA-AI-IOT/deepstream_lpr_app · GitHub:

model-engine-file=models/LP/LPR/us_lprnet_baseline18_deployable.etlt_b16_gpu0_fp16.engine
labelfile-path=models/LP/LPR/labels_us.txt
#tlt-encoded-model=models/LP/LPR/us_lprnet_baseline18_deployable.etlt
#tlt-model-key=nvidia_tlt
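
For the trtexec step, a call along these lines should produce such an engine (the file names, input tensor name and shapes below are only placeholders; check your exported model for the actual input binding and dimensions):

# sketch only: adapt paths, input name and shapes to your exported model
trtexec --onnx=models/LP/LPR/lprnet.onnx \
        --saveEngine=models/LP/LPR/lprnet_fp16.engine \
        --fp16 \
        --minShapes=image_input:1x3x48x96 \
        --optShapes=image_input:8x3x48x96 \
        --maxShapes=image_input:16x3x48x96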

The .etlt is actually an encrypted ONNX file.

Well, yes… Sorry, if you know, you know…

model-engine-file=models/LP/LPR/us_lprnet_baseline18_deployable.etlt_b16_gpu0_fp16.engine

You mean this should now be the model file created during training?

model-engine-file=models/LP/LPR/lprnet_epoch-024.fp16.engine

And I should comment out these two entries in my config?

#tlt-encoded-model=models/LP/LPR/us_lprnet_baseline18_deployable.etlt
#tlt-model-key=nvidia_tlt

I don’t have trtexec available, but from the documentation it looks like I could run it via the Docker container. Is that true?
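
I imagine it would look something like this, assuming trtexec ships inside the DeepStream container under /usr/src/tensorrt/bin (the image tag and paths are placeholders):

# rough guess, not verified
docker run --gpus all --rm -v $(pwd)/models:/models \
    nvcr.io/nvidia/deepstream:<tag> \
    /usr/src/tensorrt/bin/trtexec --onnx=/models/LP/LPR/lprnet.onnx \
        --saveEngine=/models/LP/LPR/lprnet_fp16.engine --fp16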

I did find this here: https://docs.nvidia.com/tao/tao-toolkit/text/character_recognition/lprnet.html

tao model lprnet export -m <model>
                  -k <key>
                  -e <experiment_spec>
                  [--gpu_index <gpu_index>]
                  [--log_file <log_file>]
                  [-o <output_file>]
                  [--data_type {fp32,fp16}]
                  [--max_workspace_size <max_workspace_size>]
                  [--max_batch_size <max_batch_size>]
                  [--engine_file <engine_file>]
                  [-v]

but it always just produces a *.onnx file.
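
(For reference, I am invoking the export roughly like this; everything in angle brackets is a placeholder:)

tao model lprnet export -m <path/to/trained_checkpoint> \
                        -k <key> \
                        -e <path/to/experiment_spec> \
                        -o lprnet_epoch-024.onnx \
                        --data_type fp16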

The ONNX file is also supported in deepstream_lpr_app; you can configure the ONNX file directly. Please refer to deepstream_lpr_app/deepstream-lpr-app/lpr_config_pgie.txt at master · NVIDIA-AI-IOT/deepstream_lpr_app · GitHub.
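
A minimal sketch of the relevant lines, assuming the exported model is lprnet_epoch-024.onnx (file names are only examples; see the linked config for the full set of properties):

onnx-file=models/LP/LPR/lprnet_epoch-024.onnx
labelfile-path=models/LP/LPR/labels_us.txt
# DeepStream builds and caches the engine itself, e.g.:
#model-engine-file=models/LP/LPR/lprnet_epoch-024.onnx_b16_gpu0_fp16.engine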

Wow, indeed, it works. (Note: this is still the original training configuration, so it cannot deal with spaces.)

Now, this is a good start to add my own training data set and retry… Will be back for more questions, for sure…

Thanks so far, this makes my day

Hmm. Haven’t seen this until now. My reference was this so far

This clearly uses the *.etlt. What is the difference from the other config?

And what does “num-detected-classes=45” describe here?

It is not used, since the postprocessing is customized. It can be removed.
I will sync with the DeepStream team about it.

Thanks. Learned a lot lately

But again: I see the configuration you were quoting was for a pgie, while mine is for an sgie. Is that an important difference somehow? And if so, what is the difference at all? I never got the magic behind pgie and sgie. I always thought that if one inference depends on another, it is a secondary. So since LPR depends on LPD, I would say secondary is OK. Confusing…
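
Just to show what I mean: in the configs I have seen, the relationship seems to be expressed with lines like these (a sketch, not my exact files):

# primary gie: runs on full frames
process-mode=1
gie-unique-id=1

# secondary gie: runs on the objects produced by the gie it operates on
process-mode=2
gie-unique-id=2
operate-on-gie-id=1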

I will sync with the DeepStream team to check it further. Thanks for catching this.

Thanks. And I can confirm: it is not possible to train LPRNet with [space] or similar, which is a pity.

But after the good experience with LPRNet I will retry OCRNet now. Which notebook would you suggest? There are two.

For OCRNet, please use the ViT version: tao_tutorials/notebooks/tao_launcher_starter_kit/ocrnet/ocrnet-vit.ipynb at main · NVIDIA/tao_tutorials · GitHub, because you need the attention decode method to retrain the OCRNet model.

Yes, this was the one I already started with in my first attempt.

One thing: while adopting your ONNX configuration I got an error at runtime:

ERROR: [TRT]: 3: Cannot find binding of given name: output_bbox/BiasAdd
0:00:09.626027892 13488      0x1e2b760 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<sgie2-lpr> NvDsInferContext[UID 3]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:2062> [UID = 3]: Could not find output layer 'output_bbox/BiasAdd' in engine
ERROR: [TRT]: 3: Cannot find binding of given name: output_cov/Sigmoid
0:00:09.626062854 13488      0x1e2b760 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<sgie2-lpr> NvDsInferContext[UID 3]: Warning from NvDsInferContextImpl::checkBackendParams() <nvdsinfer_context_impl.cpp:2062> [UID = 3]: Could not find output layer 'output_cov/Sigmoid' in engine
0:00:09.627751012 13488      0x1e2b760 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<sgie2-lpr> NvDsInferContext[UID 3]: Info from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2198> [UID = 3]: Use deserialized engine model: /home/ubuntu/vx-ai-golang/models/LP/LPR/lprnet_epoch-024.onnx_b16_gpu0_fp16.engine

What should I do about this?

Yes, I think that is a problem with the config. These nodes are not from LPR but from LPD.
Could you please create a topic in the DeepStream forum for the issues you found in https://github.com/NVIDIA-AI-IOT/deepstream_lpr_app/blob/master/deepstream-lpr-app/lpr_config_pgie.txt? Thanks.
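
Presumably the offending entry is something like the following, listing the LPD (DetectNet_v2) output layers, which do not exist in the LPR ONNX model:

# these layer names belong to the detector, not to LPR; removing or
# commenting out the line for the LPR sgie should silence the warnings
output-blob-names=output_cov/Sigmoid;output_bbox/BiasAdd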

Yepp. Will do

One thing I forgot to mention: if you are running TAO on Ubuntu 22.04 with Python 3.10 (and maybe later), there is a problem with the requests package. It has nothing to do with TAO itself, but it hits TAO when trying to run any Docker container.

Downgrading requests to 2.31.0 helps. Maybe you can check that with your requirements.
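
For anyone hitting this, the downgrade is just:

# pin requests in the Python environment that runs the tao launcher
pip install requests==2.31.0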
