Tao deploy error -

Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc) NVIDIA RTX A5000
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Yolo_v4
• TLT Version (Please run "tlt info --verbose" and share "docker_tag" here) Docker version 27.5.1, build 9f9e405
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)

My goal is to upgrade my model engine from DeepStream 5.x to 7.x.
What should the correct content of export_config.txt be?

$ tao deploy yolo_v4 gen_trt_engine -m /workspace/tao_tutorials/sigg_train_000_yolov4_mobilenet_v2_epoch_113.etlt -k MW05bWZiZ25jMmJzMWRubDY0M2hodWJoYzI6NWY5YzMwNzUtZDQyYi00NzdmLWIzYmMtNmZlM2NkZDQ3OTAw -e /workspace/tao_tutorials/export_config.txt --engine_file /workspace/tao_tutorials/output/yolov4_mobilenet_v2.engine -r /workspace/tau_tutorials/output
2025-01-30 17:28:54,248 [TAO Toolkit] [INFO] root 160: Registry: [‘nvcr.io’]
2025-01-30 17:28:54,279 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.5.0-deploy
2025-01-30 17:28:54,315 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 288:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the “user”:“UID:GID” in the
DockerOptions portion of the “/home/zivh/.tao_mounts.json” file. You can obtain your
users UID and GID by using the “id -u” and “id -g” commands on the
terminal.
2025-01-30 17:28:54,315 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
[2025-01-30 15:28:55,297 - TAO Toolkit - matplotlib.font_manager - INFO] generated new fontManager
2025-01-30 15:28:55,923 [TAO Toolkit] [INFO] root 167: Starting yolo_v4 gen_trt_engine.
2025-01-30 15:28:55,923 [TAO Toolkit] [INFO] root 55: The provided .etlt file is in UFF format.
2025-01-30 15:28:55,924 [TAO Toolkit] [INFO] root 56: Input name: b’Input’
2025-01-30 15:28:55,963 [TAO Toolkit] [INFO] root 167: 1:2 : Message type “Experiment” does not have extensions.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/nvidia_tao_deploy/cv/yolo_v4/scripts/gen_trt_engine.py", line 207, in <module>
main(args)
File “/usr/local/lib/python3.10/dist-packages/nvidia_tao_deploy/cv/common/decorators.py”, line 63, in _func
raise e
File “/usr/local/lib/python3.10/dist-packages/nvidia_tao_deploy/cv/common/decorators.py”, line 47, in _func
runner(cfg, **kwargs)
File “/usr/local/lib/python3.10/dist-packages/nvidia_tao_deploy/cv/yolo_v4/scripts/gen_trt_engine.py”, line 44, in main
es = load_proto(args.experiment_spec)
File “/usr/local/lib/python3.10/dist-packages/nvidia_tao_deploy/cv/yolo_v4/proto/utils.py”, line 36, in load_proto
_load_from_file(config, proto)
File “/usr/local/lib/python3.10/dist-packages/nvidia_tao_deploy/cv/yolo_v4/proto/utils.py”, line 35, in _load_from_file
merge_text_proto(f.read(), pb2)
File “/usr/local/lib/python3.10/dist-packages/google/protobuf/text_format.py”, line 719, in Merge
return MergeLines(
File “/usr/local/lib/python3.10/dist-packages/google/protobuf/text_format.py”, line 793, in MergeLines
return parser.MergeLines(lines, message)
File “/usr/local/lib/python3.10/dist-packages/google/protobuf/text_format.py”, line 818, in MergeLines
self._ParseOrMerge(lines, message)
File “/usr/local/lib/python3.10/dist-packages/google/protobuf/text_format.py”, line 837, in _ParseOrMerge
self._MergeField(tokenizer, message)
File “/usr/local/lib/python3.10/dist-packages/google/protobuf/text_format.py”, line 884, in _MergeField
raise tokenizer.ParseErrorPreviousToken(
google.protobuf.text_format.ParseError: 1:2 : Message type “Experiment” does not have extensions.
[2025-01-30 15:28:56,025 - TAO Toolkit - nvidia_tao_deploy.cv.common.entrypoint.entrypoint_proto - INFO] Sending telemetry data.
[2025-01-30 15:28:56,025 - TAO Toolkit - root - INFO] ================> Start Reporting Telemetry <================
[2025-01-30 15:28:56,025 - TAO Toolkit - root - INFO] Sending {‘version’: ‘5.5.0’, ‘action’: ‘gen_trt_engine’, ‘network’: ‘yolo_v4’, ‘gpu’: [‘NVIDIA-RTX-A5000’], ‘success’: False, ‘time_lapsed’: 0.4979846477508545} to https://api.tao.ngc.nvidia.com.
[2025-01-30 15:28:57,123 - TAO Toolkit - root - INFO] Telemetry sent successfully.
[2025-01-30 15:28:57,124 - TAO Toolkit - root - INFO] ================> End Reporting Telemetry <================
[2025-01-30 15:28:57,124 - TAO Toolkit - nvidia_tao_deploy.cv.common.entrypoint.entrypoint_proto - INFO] Execution status: FAIL
2025-01-30 17:28:57,211 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.

$ nano export_config.txt
[export_config]
k: MW05b…NmZlM2NkZDQ3OTAw
input_type=etlt
output_type=engine
input_shape=3,384,512
output_nodes=BatchedNMS
batch_size=16
precision=fp32


For the TAO 5.5 deploy docker, please refer to its 5.5 notebook: tao_tutorials/notebooks/tao_launcher_starter_kit/yolo_v4/yolo_v4.ipynb at tao_5.5_release · NVIDIA/tao_tutorials · GitHub. There you can find the export command:

!tao model yolo_v4 export -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.hdf5 \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.onnx \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                    --target_opset 12 \
                    --gen_ds_config

The reference config file can be found in tao_tutorials/notebooks/tao_launcher_starter_kit/yolo_v4/specs/yolo_v4_retrain_resnet18_kitti.txt at tao_5.5_release · NVIDIA/tao_tutorials · GitHub.
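
Once the .onnx file has been exported, the TensorRT engine is generated with the gen_trt_engine task. A rough sketch following the same notebook (paths, spec file, and batch sizes are placeholders; adjust them to your setup):

!tao deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.onnx \
                    -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                    --batch_size 16 \
                    --min_batch_size 1 \
                    --opt_batch_size 8 \
                    --max_batch_size 16 \
                    --data_type fp32 \
                    --engine_file $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_$EPOCH.engine \
                    --results_dir $USER_EXPERIMENT_DIR/export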

Running the command you suggested, I get the message "Saving exported model to /workspace/tao_tutorials/convert/output/new3.onnx", but there are no files in the output directory and there is no error message. Can you explain?

Another question:
Is this procedure really necessary to run my .engine, which was built on DeepStream 5.x + TensorRT 7.1.3, on DeepStream 7.x + TensorRT 10.3.0? Or is it possible to just replace the .engine with the .etlt file (if so, please point to an example)?

tao model yolo_v4 export -m /workspace/tao_tutorials/convert/sigg_train_000_yolov4_mobilenet_v2_epoch_113.etlt -o /workspace/tao_tutorials/convert/output/new3.onnx -e /workspace/tao_tutorials/convert/sigg_train_000_yolo_v4_train_mobilenet_v2_kitti.txt --target_opset 12 --gen_ds_config
2025-02-02 08:33:21,082 [TAO Toolkit] [INFO] root 160: Registry: [‘nvcr.io’]
2025-02-02 08:33:21,117 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2025-02-02 08:33:21,208 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 288:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the “user”:“UID:GID” in the
DockerOptions portion of the “/home/zivh/.tao_mounts.json” file. You can obtain your
users UID and GID by using the “id -u” and “id -g” commands on the
terminal.
2025-02-02 08:33:21,208 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
Using TensorFlow backend.
2025-02-02 06:33:22.290415: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2025-02-02 06:33:22,356 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2025-02-02 06:33:23,286 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-02 06:33:23,310 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-02 06:33:23,313 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-02 06:33:24,454 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-02 06:33:25,615 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-02 06:33:25,634 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-02 06:33:25,636 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-02 06:33:25,992 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.app 264: Saving exported model to /workspace/tao_tutorials/convert/output/new3.onnx
2025-02-02 06:33:25,993 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.keras_exporter 119: Setting the onnx export route to keras2onnx
Execution status: PASS
2025-02-02 08:33:30,030 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.

The input for exporting should be an .hdf5 file or a .tlt file. Please modify your command; yours uses an .etlt file, which is not expected. I suggest you run the notebook to get familiar with the workflow.

I tried to deploy with the .tlt file but got the same result - no files in the output directory.

I want to clarify my goal again.
Because we upgraded our Xavier NX to an Orin NX, I got this error:
“Trying to load an engine created with incompatible serialization version. Check that the engine was not created using safety runtime, same OS was used and version compatibility parameters were set accordingly.)”

My deploy attempts were done on a desktop x86 computer (Ubuntu 20.04 + NVIDIA A5000, CUDA 11.4 + tao-5.5.0) and also on my x86 laptop (Ubuntu 24.04 + RTX 2000, CUDA 12.4).
(As I read the docs, the deploy can't be done on an ARM Xavier NX or Orin NX - TAO can't run on ARM. Correct?)

Can you please clarify whether both computers are OK for doing the deploy, and which setup is needed? There are multiple prerequisites/configurations/versions.

You can copy the onnx file to your Jetson device, then run trtexec to generate the TensorRT engine.
An example command can be found in TRTEXEC with YOLO_v4 - NVIDIA Docs.
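
A typical invocation looks roughly like the sketch below; the input tensor name and shapes are illustrative and must match your exported model:

trtexec --onnx=yolov4_resnet18_epoch_010.onnx \
        --minShapes=Input:1x3x384x1248 \
        --optShapes=Input:8x3x384x1248 \
        --maxShapes=Input:16x3x384x1248 \
        --fp16 \
        --saveEngine=yolov4_resnet18_epoch_010.engine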

I don't have an .onnx file, and I failed to generate one.
Please read my last response (Feb 6) again; it is all about that and contains several questions.
Thank you.

Did you ever try to run the YOLO_v4 notebook tao_tutorials/notebooks/tao_launcher_starter_kit/yolo_v4/yolo_v4.ipynb at tao_5.5_release · NVIDIA/tao_tutorials · GitHub? If not, please try it. Again, when you run the export command, the input file is an .hdf5 file (previously a .tlt file) and the output file is an .onnx file (i.e., what we previously called an .etlt file). As mentioned above, you did not run the command correctly. Your input is set to an .etlt file, which is unexpected.

Please use the correct command to generate the .onnx file. For TAO 5.0 or later versions, it is named an .onnx file instead of an .etlt file.

The deployment can be done on either dGPU or Jetson devices. The simplest way is to generate the TensorRT engine from the .onnx file.

Hello Morganh,
I ran the YOLO_v4 notebook and managed to create .hdf5 files in the train section.

I managed to generate an .onnx file via the "tao model yolo_v4 export…" command only when the input is an .hdf5 file.
When the input is a .tlt file, there is a message that an .onnx file was generated, but no file is actually created.

Please advise how to convert a .tlt file to .onnx (otherwise this means retraining all of my models).

Please use the trick below to convert the .tlt file to .hdf5, then use it to export to an .onnx file.

tao_toolkit_recipes/tao_forum_faq/FAQ.md at main · NVIDIA-AI-IOT/tao_toolkit_recipes · GitHub.

Hi,
After generating the .onnx and .engine files both on my desktop x86 (Ubuntu 20.04 + NVIDIA A5000, CUDA 11.4 + tao-5.5.0) and on AWS EC2, running the .engine on the Jetson Orin machine gives this error:
ERROR: [TRT]: IRuntime::deserializeCudaEngine: Error Code 1: Serialization (Serialization assertion stdVersionRead == kSERIALIZATION_VERSION failed.Version tag does not match. Note: Current Version: 239, Serialized Engine Version: 236)
ERROR: Deserialize engine failed from file: /media/prog/home/mic-710aix/robot/src/detection_pipeline/models/yolov4_resnet18_epoch_011.engine

Was your advice to generate the .onnx and .engine on the Jetson, and also to use trtexec (see the failure below)?
Trying to generate the .onnx on the Jetson Orin causes the machine to crash.
Error details:
!tao model yolo_v4 export -m $SPECS_DIR/yolov4_resnet18_epoch_010.hdf5 \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_010.onnx \
                    -e $SPECS_DIR/sigg_train_000_yolo_v4_train_mobilenet_v2_kitti.txt \
                    --target_opset 12 \
                    --gen_ds_config

After the failure, I uploaded the .onnx file from my desktop and tried to generate an .engine:
!tao deploy yolo_v4 gen_trt_engine -m $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_010.onnx \
                    -e $SPECS_DIR/sigg_train_000_yolo_v4_train_mobilenet_v2_kitti.txt \
                    --batch_size 16 \
                    --min_batch_size 1 \
                    --opt_batch_size 8 \
                    --max_batch_size 16 \
                    --data_type fp32 \
                    --engine_file $USER_EXPERIMENT_DIR/export/yolov4_resnet18_epoch_011.engine \
                    --results_dir $USER_EXPERIMENT_DIR/export

2025-02-20 13:40:21,815 [TAO Toolkit] [INFO] root 160: Registry: [‘nvcr.io’]
2025-02-20 13:40:22,008 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.5.0-deploy
2025-02-20 13:40:22,096 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 288:
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the “user”:“UID:GID” in the
DockerOptions portion of the “/media/prog/home/mic-710aix/.tao_mounts.json” file. You can obtain your
users UID and GID by using the “id -u” and “id -g” commands on the
terminal.
2025-02-20 13:40:22,096 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
Error response from daemon: container cac417124eb791ad25d4eb8fafbd6112304bc7c601dd38a360ea4ad7e259848b is not running
2025-02-20 13:40:23,112 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.
Traceback (most recent call last):
File “/home/mic-710aix/.pyenv/versions/3.10.1/lib/python3.10/site-packages/docker/api/client.py”, line 259, in _raise_for_status
response.raise_for_status()
File “/home/mic-710aix/.pyenv/versions/3.10.1/lib/python3.10/site-packages/requests/models.py”, line 1021, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: http+docker://localhost/v1.47/containers/cac417124eb791ad25d4eb8fafbd6112304bc7c601dd38a360ea4ad7e259848b/stop

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/mic-710aix/.pyenv/versions/3.10.1/bin/tao", line 8, in <module>
sys.exit(main())
File “/home/mic-710aix/.pyenv/versions/3.10.1/lib/python3.10/site-packages/nvidia_tao_cli/entrypoint/tao_launcher.py”, line 134, in main
instance.launch_command(

raise create_api_error_from_http_exception(e)
File “/home/mic-710aix/.pyenv/versions/3.10.1/lib/python3.10/site-packages/docker/errors.py”, line 31, in create_api_error_from_http_exception
raise cls(e, response=response, explanation=explanation)
docker.errors.NotFound: 404 Client Error: Not Found (“No such container: cac417124eb791ad25d4eb8fafbd6112304bc7c601dd38a360ea4ad7e259848b”)

Is it possible to resolve this on my desktop?
I also attached the trtexec failure in the next response.

I also tried trtexec but received the error below:
trtexec --onnx=yolov4_resnet18_epoch_010.onnx \
        --maxShapes=Input:16x3x384x1248 \
        --minShapes=Input:1x3x384x1248 \
        --optShapes=Input:8x3x384x1248 \
        --calib=calib.txt \
        --fp16 \
        --int8 \
        --saveEngine=trt_model.engine

(Note: I assume calib.txt is generated automatically?)

&&&& RUNNING TensorRT.trtexec [TensorRT v100300] # trtexec --onnx=yolov4_resnet18_epoch_010.onnx --maxShapes=Input:16x3x384x1248 --minShapes=Input:1x3x384x1248 --optShapes=Input:8x3x384x1248 --calib=calib.txt --fp16 --int8 --saveEngine=trt_model.engine
[02/20/2025-14:02:00] [I] === Model Options ===
[02/20/2025-14:02:00] [I] Format: ONNX
[02/20/2025-14:02:00] [I] Model: yolov4_resnet18_epoch_010.onnx
[02/20/2025-14:02:00] [I] Output:
[02/20/2025-14:02:00] [I] === Build Options ===
[02/20/2025-14:02:00] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
[02/20/2025-14:02:00] [I] avgTiming: 8
[02/20/2025-14:02:00] [I] Precision: FP32+FP16+INT8
[02/20/2025-14:02:00] [I] LayerPrecisions:
[02/20/2025-14:02:00] [I] Layer Device Types:
[02/20/2025-14:02:00] [I] Calibration: calib.txt
[02/20/2025-14:02:00] [I] Refit: Disabled
[02/20/2025-14:02:00] [I] Strip weights: Disabled
[02/20/2025-14:02:00] [I] Version Compatible: Disabled
[02/20/2025-14:02:00] [I] ONNX Plugin InstanceNorm: Disabled
[02/20/2025-14:02:00] [I] TensorRT runtime: full
[02/20/2025-14:02:00] [I] Lean DLL Path:
[02/20/2025-14:02:00] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[02/20/2025-14:02:00] [I] Exclude Lean Runtime: Disabled
[02/20/2025-14:02:00] [I] Sparsity: Disabled
[02/20/2025-14:02:00] [I] Safe mode: Disabled
[02/20/2025-14:02:00] [I] Build DLA standalone loadable: Disabled
[02/20/2025-14:02:00] [I] Allow GPU fallback for DLA: Disabled
[02/20/2025-14:02:00] [I] DirectIO mode: Disabled
[02/20/2025-14:02:00] [I] Restricted mode: Disabled
[02/20/2025-14:02:00] [I] Skip inference: Disabled
[02/20/2025-14:02:00] [I] Save engine: trt_model.engine
[02/20/2025-14:02:00] [I] Load engine:
[02/20/2025-14:02:00] [I] Profiling verbosity: 0
[02/20/2025-14:02:00] [I] Tactic sources: Using default tactic sources
[02/20/2025-14:02:00] [I] timingCacheMode: local
[02/20/2025-14:02:00] [I] timingCacheFile:
[02/20/2025-14:02:00] [I] Enable Compilation Cache: Enabled
[02/20/2025-14:02:00] [I] errorOnTimingCacheMiss: Disabled
[02/20/2025-14:02:00] [I] Preview Features: Use default preview flags.
[02/20/2025-14:02:00] [I] MaxAuxStreams: -1
[02/20/2025-14:02:00] [I] BuilderOptimizationLevel: -1
[02/20/2025-14:02:00] [I] Calibration Profile Index: 0
[02/20/2025-14:02:00] [I] Weight Streaming: Disabled
[02/20/2025-14:02:00] [I] Runtime Platform: Same As Build
[02/20/2025-14:02:00] [I] Debug Tensors:
[02/20/2025-14:02:00] [I] Input(s)s format: fp32:CHW
[02/20/2025-14:02:00] [I] Output(s)s format: fp32:CHW
[02/20/2025-14:02:00] [I] Input build shape (profile 0): Input=1x3x384x1248+8x3x384x1248+16x3x384x1248
[02/20/2025-14:02:00] [I] Input calibration shape : Input=1x3x384x1248+8x3x384x1248+16x3x384x1248
[02/20/2025-14:02:00] [I] === System Options ===
[02/20/2025-14:02:00] [I] Device: 0
[02/20/2025-14:02:00] [I] DLACore:
[02/20/2025-14:02:00] [I] Plugins:
[02/20/2025-14:02:00] [I] setPluginsToSerialize:
[02/20/2025-14:02:00] [I] dynamicPlugins:
[02/20/2025-14:02:00] [I] ignoreParsedPluginLibs: 0
[02/20/2025-14:02:00] [I]
[02/20/2025-14:02:00] [I] === Inference Options ===
[02/20/2025-14:02:00] [I] Batch: Explicit
[02/20/2025-14:02:00] [I] Input inference shape : Input=8x3x384x1248
[02/20/2025-14:02:00] [I] Iterations: 10
[02/20/2025-14:02:00] [I] Duration: 3s (+ 200ms warm up)
[02/20/2025-14:02:00] [I] Sleep time: 0ms
[02/20/2025-14:02:00] [I] Idle time: 0ms
[02/20/2025-14:02:00] [I] Inference Streams: 1
[02/20/2025-14:02:00] [I] ExposeDMA: Disabled
[02/20/2025-14:02:00] [I] Data transfers: Enabled
[02/20/2025-14:02:00] [I] Spin-wait: Disabled
[02/20/2025-14:02:00] [I] Multithreading: Disabled
[02/20/2025-14:02:00] [I] CUDA Graph: Disabled
[02/20/2025-14:02:00] [I] Separate profiling: Disabled
[02/20/2025-14:02:00] [I] Time Deserialize: Disabled
[02/20/2025-14:02:00] [I] Time Refit: Disabled
[02/20/2025-14:02:00] [I] NVTX verbosity: 0
[02/20/2025-14:02:00] [I] Persistent Cache Ratio: 0
[02/20/2025-14:02:00] [I] Optimization Profile Index: 0
[02/20/2025-14:02:00] [I] Weight Streaming Budget: 100.000000%
[02/20/2025-14:02:00] [I] Inputs:
[02/20/2025-14:02:00] [I] Debug Tensor Save Destinations:
[02/20/2025-14:02:00] [I] === Reporting Options ===
[02/20/2025-14:02:00] [I] Verbose: Disabled
[02/20/2025-14:02:00] [I] Averages: 10 inferences
[02/20/2025-14:02:00] [I] Percentiles: 90,95,99
[02/20/2025-14:02:00] [I] Dump refittable layers:Disabled
[02/20/2025-14:02:00] [I] Dump output: Disabled
[02/20/2025-14:02:00] [I] Profile: Disabled
[02/20/2025-14:02:00] [I] Export timing to JSON file:
[02/20/2025-14:02:00] [I] Export output to JSON file:
[02/20/2025-14:02:00] [I] Export profile to JSON file:
[02/20/2025-14:02:00] [I]
[02/20/2025-14:02:00] [I] === Device Information ===
[02/20/2025-14:02:00] [I] Available Devices:
[02/20/2025-14:02:00] [I] Device 0: “Orin” UUID: GPU-717317cb-4fe5-582f-bc3e-cab0315e68b4
[02/20/2025-14:02:00] [I] Selected Device: Orin
[02/20/2025-14:02:00] [I] Selected Device ID: 0
[02/20/2025-14:02:00] [I] Selected Device UUID: GPU-717317cb-4fe5-582f-bc3e-cab0315e68b4
[02/20/2025-14:02:00] [I] Compute Capability: 8.7
[02/20/2025-14:02:00] [I] SMs: 8
[02/20/2025-14:02:00] [I] Device Global Memory: 30696 MiB
[02/20/2025-14:02:00] [I] Shared Memory per SM: 164 KiB
[02/20/2025-14:02:00] [I] Memory Bus Width: 256 bits (ECC disabled)
[02/20/2025-14:02:00] [I] Application Compute Clock Rate: 1.3 GHz
[02/20/2025-14:02:00] [I] Application Memory Clock Rate: 0.612 GHz
[02/20/2025-14:02:00] [I]
[02/20/2025-14:02:00] [I] Note: The application clock rates do not reflect the actual clock rates that the GPU is currently running at.
[02/20/2025-14:02:00] [I]
[02/20/2025-14:02:00] [I] TensorRT version: 10.3.0
[02/20/2025-14:02:00] [I] Loading standard plugins
[02/20/2025-14:02:00] [I] [TRT] [MemUsageChange] Init CUDA: CPU +2, GPU +0, now: CPU 31, GPU 3389 (MiB)
[02/20/2025-14:02:02] [I] [TRT] [MemUsageChange] Init builder kernel library: CPU +928, GPU +752, now: CPU 1002, GPU 4184 (MiB)
[02/20/2025-14:02:02] [I] Start parsing network model.
[02/20/2025-14:02:03] [E] [TRT] ModelImporter.cpp:914: Failed to parse ONNX model from file: yolov4_resnet18_epoch_010.onnx!
[02/20/2025-14:02:03] [E] Failed to parse onnx file
[02/20/2025-14:02:03] [I] Finished parsing network model. Parse time: 0.0941306
[02/20/2025-14:02:03] [E] Parsing model failed
[02/20/2025-14:02:03] [E] Failed to create engine from model or file.
[02/20/2025-14:02:03] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v100300] # trtexec --onnx=yolov4_resnet18_epoch_010.onnx --maxShapes=Input:16x3x384x1248 --minShapes=Input:1x3x384x1248 --optShapes=Input:8x3x384x1248 --calib=calib.txt --fp16 --int8 --saveEngine=trt_model.engine

No. You only need to generate the .engine. Copy the onnx file from the A5000 to the Jetson.

On the Jetson, you only need to run trtexec against the onnx file you have generated. It will generate the TensorRT engine.

Please make sure you can open it with the Netron tool. Then copy this yolov4_resnet18_epoch_010.onnx to the Jetson.
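
A quick sanity check before copying, assuming the onnx Python package is installed on the host:

python -c "import onnx; onnx.checker.check_model(onnx.load('yolov4_resnet18_epoch_010.onnx'))"

If this raises no error and Netron can render the graph, the file should be safe to copy over.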

I’m trying to convert .tlt to .hdf5 following the link you sent
tao_toolkit_recipes/tao_forum_faq/FAQ.md at main · NVIDIA-AI-IOT/tao_toolkit_recipes · GitHub.

I installed nvidia-tao-tf1 successfully but get this error:
ModuleNotFoundError: No module named 'nvidia_tao_tf1'

You do not need to install nvidia_tao_tf1 yourself.
As mentioned in the link, please run it directly inside the docker container.

$ docker run --runtime=nvidia -it --rm -v /home/morganh:/home/morganh nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5 /bin/bash

then run the code.
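
For reference, the FAQ code essentially decrypts the .tlt checkpoint with your training key. A rough sketch of the idea, run inside that container (the nvidia_tao_tf1 encoding helper and its decode() signature are assumptions based on the FAQ snippet; the key and file names are placeholders):

python3 - <<'EOF'
# Hedged sketch: decode an encrypted .tlt checkpoint into a plain Keras .hdf5 file.
# Assumes the encoding helper shipped inside the TF1 container; see the FAQ for the exact snippet.
from nvidia_tao_tf1.encoding import encoding

KEY = "YOUR_TRAINING_KEY"  # the -k key used during training

with open("yolov4_mobilenet_v2_epoch_177.tlt", "rb") as encoded_file, \
     open("yolov4_mobilenet_v2_epoch_177.hdf5", "wb") as decoded_file:
    # decode(encrypted_stream, decrypted_stream, key) writes the decrypted bytes to the output stream;
    # some versions may expect the key as a plain string rather than bytes.
    encoding.decode(encoded_file, decoded_file, KEY.encode())
EOF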

Hi Morganh,
Using the docker, I ran this:

python decode_etlt.py -m sigg_train_000_yolov4_mobilenet_v2_epoch_113.etlt -o sigg_train_000_yolov4_mobilenet_v2_epoch_113.onnx -k nvidia_tlt

It generated a corrupted .onnx file (error in trtexec)

I also checked the generated .onnx model:

python -c "import onnx; onnx.checker.check_model(onnx.load('sigg_train_000_yolov4_mobilenet_v2_epoch_113.onnx'))"

Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 134, in load_model
model = load_model_from_string(s, format=format)
File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 171, in load_model_from_string
return _deserialize(s, ModelProto())
File "/usr/local/lib/python3.8/dist-packages/onnx/__init__.py", line 108, in _deserialize
decoded = cast(Optional[int], proto.ParseFromString(s))
google.protobuf.message.DecodeError: Error parsing message with type 'onnx.ModelProto'

Details -
error in trtexec:

&&&& RUNNING TensorRT.trtexec [TensorRT v100300] # trtexec --onnx=sigg_v1_0.onnx --maxShapes=Input:16x3x384x1248 --minShapes=Input:1x3x384x1248 --optShapes=Input:8x3x384x1248 --fp16 --int8 --saveEngine=trt_model.engine
[02/23/2025-11:08:30] [I] === Model Options ===
[02/23/2025-11:08:30] [I] Format: ONNX
[02/23/2025-11:08:30] [I] Model: sigg_v1_0.onnx
[02/23/2025-11:08:30] [I] Output:
[02/23/2025-11:08:30] [I] === Build Options ===
[02/23/2025-11:08:30] [I] Memory Pools: workspace: default, dlaSRAM: default, dlaLocalDRAM: default, dlaGlobalDRAM: default, tacticSharedMem: default
[02/23/2025-11:08:30] [I] avgTiming: 8
[02/23/2025-11:08:30] [I] Precision: FP32+FP16+INT8
[02/23/2025-11:08:30] [I] LayerPrecisions:
[02/23/2025-11:08:30] [I] Layer Device Types:
[02/23/2025-11:08:30] [I] Calibration: Dynamic
[02/23/2025-11:08:30] [I] Refit: Disabled
[02/23/2025-11:08:30] [I] Strip weights: Disabled
[02/23/2025-11:08:30] [I] Version Compatible: Disabled
[02/23/2025-11:08:30] [I] ONNX Plugin InstanceNorm: Disabled
[02/23/2025-11:08:30] [I] TensorRT runtime: full
[02/23/2025-11:08:30] [I] Lean DLL Path:
[02/23/2025-11:08:30] [I] Tempfile Controls: { in_memory: allow, temporary: allow }
[02/23/2025-11:08:30] [I] Exclude Lean Runtime: Disabled
[02/23/2025-11:08:30] [I] Sparsity: Disabled
[02/23/2025-11:08:30] [I] Safe mode: Disabled
[02/23/2025-11:08:30] [I] Build DLA standalone loadable: Disabled
[02/23/2025-11:08:30] [I] Allow GPU fallback for DLA: Disabled
[02/23/2025-11:08:30] [I] DirectIO mode: Disabled
[02/23/2025-11:08:30] [I] Restricted mode: Disabled
[02/23/2025-11:08:30] [I] Skip inference: Disabled
[02/23/2025-11:08:30] [I] Save engine: trt_model.engine
[02/23/2025-11:08:30] [I] Load engine:
[02/23/2025-11:08:30] [I] Profiling verbosity: 0
[02/23/2025-11:08:30] [I] Tactic sources: Using default tactic sources
[02/23/2025-11:08:30] [I] timingCacheMode: local
[02/23/2025-11:08:30] [I] timingCacheFile:
[02/23/2025-11:08:30] [I] Enable Compilation Cache: Enabled
[02/23/2025-11:08:30] [I] errorOnTimingCacheMiss: Disabled
[02/23/2025-11:08:30] [I] Preview Features: Use default preview flags.
[02/23/2025-11:08:30] [I] MaxAuxStreams: -1
[02/23/2025-11:08:30] [I] BuilderOptimizationLevel: -1
[02/23/2025-11:08:30] [I] Calibration Profile Index: 0
[02/23/2025-11:08:30] [I] Weight Streaming: Disabled
[02/23/2025-11:08:30] [I] Runtime Platform: Same As Build
[02/23/2025-11:08:30] [I] Debug Tensors:
[02/23/2025-11:08:30] [I] Input(s)s format: fp32:CHW
[02/23/2025-11:08:30] [I] Output(s)s format: fp32:CHW
[02/23/2025-11:08:30] [I] Input build shape (profile 0): Input=1x3x384x1248+8x3x384x1248+16x3x384x1248
[02/23/2025-11:08:30] [I] Input calibration shapes: model
[02/23/2025-11:08:30] [I] === System Options ===
[02/23/2025-11:08:30] [I] Device: 0
[02/23/2025-11:08:30] [I] DLACore:
[02/23/2025-11:08:30] [I] Plugins:
[02/23/2025-11:08:30] [I] setPluginsToSerialize:
[02/23/2025-11:08:30] [I] dynamicPlugins:
[02/23/2025-11:08:30] [I] ignoreParsedPluginLibs: 0
[02/23/2025-11:08:30] [I]
[02/23/2025-11:08:30] [I] === Inference Options ===
[02/23/2025-11:08:30] [I] Batch: Explicit
[02/23/2025-11:08:30] [I] Input inference shape : Input=8x3x384x1248
[02/23/2025-11:08:30] [I] Iterations: 10
[02/23/2025-11:08:30] [I] Duration: 3s (+ 200ms warm up)
[02/23/2025-11:08:30] [I] Sleep time: 0ms
[02/23/2025-11:08:30] [I] Idle time: 0ms
[02/23/2025-11:08:30] [I] Inference Streams: 1
[02/23/2025-11:08:30] [I] ExposeDMA: Disabled
[02/23/2025-11:08:30] [I] Data transfers: Enabled
[02/23/2025-11:08:30] [I] Spin-wait: Disabled
[02/23/2025-11:08:30] [I] Multithreading: Disabled
[02/23/2025-11:08:30] [I] CUDA Graph: Disabled
[02/23/2025-11:08:30] [I] Separate profiling: Disabled
[02/23/2025-11:08:30] [I] Time Deserialize: Disabled
[02/23/2025-11:08:30] [I] Time Refit: Disabled
[02/23/2025-11:08:30] [I] NVTX verbosity: 0
[02/23/2025-11:08:30] [I] Persistent Cache Ratio: 0
[02/23/2025-11:08:30] [I] Optimization Profile Index: 0
[02/23/2025-11:08:30] [I] Weight Streaming Budget: 100.000000%
[02/23/2025-11:08:30] [I] Inputs:
[02/23/2025-11:08:30] [I] Debug Tensor Save Destinations:
[02/23/2025-11:08:30] [I] === Reporting Options ===
[02/23/2025-11:08:30] [I] Verbose: Disabled
[02/23/2025-11:08:30] [I] Averages: 10 inferences
[02/23/2025-11:08:30] [I] Percentiles: 90,95,99
[02/23/2025-11:08:30] [I] Dump refittable layers:Disabled
[02/23/2025-11:08:30] [I] Dump output: Disabled
[02/23/2025-11:08:30] [I] Profile: Disabled
[02/23/2025-11:08:30] [I] Export timing to JSON file:
[02/23/2025-11:08:30] [I] Export output to JSON file:
[02/23/2025-11:08:30] [I] Export profile to JSON file:
[02/23/2025-11:08:30] [I]
[02/23/2025-11:08:30] [I] === Device Information ===
[02/23/2025-11:08:30] [I] Available Devices:
[02/23/2025-11:08:30] [I] Device 0: “Orin” UUID: GPU-717317cb-4fe5-582f-bc3e-cab0315e68b4

You can use the above-mentioned recipe to decode the .tlt file to an .hdf5 file.
Then use the latest TAO container to run the export and make sure the above command runs successfully; that is, export the .hdf5 file to an .onnx file.
And make sure you can open this onnx file with the Netron tool:
https://netron.app/

Hello Morganh,
Decoding .tlt to .hdf5 with tao model export does not work - again, it completes with PASS but with NO results (see details below), the same as I reported in my previous message.

I have been working on this task for almost a month now, following your instructions, but without reaching our goal.
I would really appreciate it if you could escalate this issue to help us get a better solution/service to resolve this problem.

Details -

# tao <task> export will fail if .onnx already exists. So we clear the export folder before tao <task> export
!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
# Generate .onnx file using tao container
!tao model yolo_v4 export -m $SPECS_DIR/yolov4_mobilenet_v2_epoch_177.tlt \
                    -o $USER_EXPERIMENT_DIR/export/tlt_test.hdf5 \
                    -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                    --target_opset 12 \
                    --gen_ds_config

2025-02-24 06:04:24,923 [TAO Toolkit] [INFO] root 160: Registry: ['nvcr.io']
2025-02-24 06:04:24,983 [TAO Toolkit] [INFO] nvidia_tao_cli.components.instance_handler.local_instance 360: Running command in container: nvcr.io/nvidia/tao/tao-toolkit:5.0.0-tf1.15.5
2025-02-24 06:04:25,018 [TAO Toolkit] [WARNING] nvidia_tao_cli.components.docker_handler.docker_handler 288: Docker will run the commands as root. If you would like to retain your local host permissions, please add the "user":"UID:GID" in the DockerOptions portion of the "/home/ubuntu/.tao_mounts.json" file. You can obtain your users UID and GID by using the "id -u" and "id -g" commands on the terminal.
2025-02-24 06:04:25,018 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 301: Printing tty value True
Using TensorFlow backend.
2025-02-24 06:04:25.766277: I tensorflow/stream_executor/platform/default/dso_loader.cc:50] Successfully opened dynamic library libcudart.so.12
2025-02-24 06:04:25,825 [TAO Toolkit] [WARNING] tensorflow 40: Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
2025-02-24 06:04:27,153 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-24 06:04:27,193 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-24 06:04:27,197 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-24 06:04:28,739 [TAO Toolkit] [INFO] matplotlib.font_manager 1633: generated new fontManager
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
WARNING:tensorflow:TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-24 06:04:30,914 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use sklearn by default. This improves performance in some cases. To enable sklearn export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-24 06:04:30,954 [TAO Toolkit] [WARNING] tensorflow 42: TensorFlow will not use Dask by default. This improves performance in some cases. To enable Dask export the environment variable TF_ALLOW_IOLIBS=1.
WARNING:tensorflow:TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-24 06:04:30,957 [TAO Toolkit] [WARNING] tensorflow 43: TensorFlow will not use Pandas by default. This improves performance in some cases. To enable Pandas export the environment variable TF_ALLOW_IOLIBS=1.
2025-02-24 06:04:31,848 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.app 264: Saving exported model to /workspace/tao-experiments/yolo_v4/export/tlt_test.hdf5.onnx
2025-02-24 06:04:31,849 [TAO Toolkit] [INFO] nvidia_tao_tf1.cv.common.export.keras_exporter 119: Setting the onnx export route to keras2onnx
Execution status: PASS
2025-02-24 06:04:35,918 [TAO Toolkit] [INFO] nvidia_tao_cli.components.docker_handler.docker_handler 363: Stopping container.

For “decode .tlt to .hdf5”, please use tao_toolkit_recipes/tao_forum_faq/FAQ.md at main · NVIDIA-AI-IOT/tao_toolkit_recipes · GitHub.

Unfortunately, your command is not correct.
In the TAO 4.x docker, for exporting, the input should be a .tlt file and the output an .etlt file.
In the TAO 5.x docker, for exporting, the input should be an .hdf5 file and the output an .onnx file.

BTW, you can use tao_tutorials/notebooks/tao_launcher_starter_kit/yolo_v4/yolo_v4.ipynb at main · NVIDIA/tao_tutorials · GitHub to get familiar with it.
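
For example, once the .tlt file has been decoded to .hdf5 via the FAQ recipe, a corrected version of your export command would look roughly like this (spec file and paths are placeholders to adjust to your setup):

!tao model yolo_v4 export -m $SPECS_DIR/yolov4_mobilenet_v2_epoch_177.hdf5 \
                    -o $USER_EXPERIMENT_DIR/export/yolov4_mobilenet_v2_epoch_177.onnx \
                    -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                    --target_opset 12 \
                    --gen_ds_config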