Problem in tlt export

Hi.
According to the documentation, when selecting the first option for ingesting training data, we do not need to include
**--cal_image_dir** and **--cal_data_file**.

https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/object_detection/fasterrcnn.html#int8-mode-overview

So I executed the export command from this page:
https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/object_detection/fasterrcnn.html#exporting-a-model

But I got this error:

Traceback (most recent call last):
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/yolo_v3/scripts/export.py", line 12, in <module>
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 219, in launch_export
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 201, in run_export
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/keras_exporter.py", line 360, in export
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 184, in get_calibrator
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 284, in generate_tensor_file
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/base_exporter.py", line 327, in generate_random_tensorfile
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/tensorfile.py", line 54, in __init__
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 312, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "/usr/local/lib/python3.6/dist-packages/h5py/_hl/files.py", line 148, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 98, in h5py.h5f.create
ValueError: Invalid file name (invalid file name)
Using TensorFlow backend.
write export results to /workspace/tlt/local_dir/tlt_yolov3_pv_resnet18/unpruned_model/export
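For context, the call chain in the traceback (generate_tensor_file → generate_random_tensorfile → tensorfile.py → h5py.h5f.create) suggests that when --cal_data_file is omitted, the exporter falls back to generating a random calibration tensorfile and then hands h5py an empty path, which h5py rejects. A minimal sketch of that precondition (hypothetical helper, not TLT code):

```python
# Hypothetical sketch (not TLT code): the traceback shows the exporter
# falling back to random tensorfile generation and h5py then rejecting
# the file name, which suggests the tensorfile path ended up empty when
# --cal_data_file was omitted.
def tensorfile_path_ok(cal_data_file):
    """Return True only for a non-empty tensorfile path that h5py could create."""
    return bool(cal_data_file) and bool(cal_data_file.strip())
```

So even when calibration is driven by the training data loader, a writable tensorfile path still has to exist for the HDF5 file to be created.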

Can you share your command?


This is my command:

export_cmd = f'''{algorithm_name} export -m {final_model}  \
                        -o {output_folder_address}/final_model.etlt \
                        -e {spec_dir} \
                        -k $KEY \
                        --data_type {data_type} \
                        --batch_size {batch_size} \
                        --batches {batches} \
                        --cal_cache_file {output_folder_address}/cal.bin  \
                        --engine_file {output_folder_address}/trt.engine \
                        --gpu_index {gpu_num}'''

Can you double check below?
-m {final_model}

To narrow down, I suggest you run the explicit command with “tlt faster_rcnn export xxx” .

Thanks for help.
I use TLT version 3 inside the Docker container, and I wrote all the task commands (train, evaluate, inference, prune) in the same way. Also, there is no problem when I set --cal_image_dir and --cal_data_file, but I understood from the documentation that these should not be needed.

So, can you run the explicit command with “faster_rcnn export xxx” inside the docker?

I executed this command. But I got the same error:

!yolo_v3 export -m /workspace/tlt/local_dir/tlt_yolov3_pv_resnet18/unpruned_model/weights/final_model.tlt \
-o /workspace/tlt/local_dir/tlt_yolov3_pv_resnet18/unpruned_model/export/final_model.etlt \
-e /workspace/tlt/local_dir/tlt_yolov3_pv_resnet18/unpruned_model/final_spec.txt \
-k $KEY \
--data_type int8 \
--batch_size 2 \
--batches 10 \
--cal_cache_file /workspace/tlt/local_dir/tlt_yolov3_pv_resnet18/unpruned_model/export/cal.bin \
--engine_file /workspace/tlt/local_dir/tlt_yolov3_pv_resnet18/unpruned_model/export/trt.engine \
--gpu_index 0

For yolo_v3, can you follow the "YOLOv3 — Transfer Learning Toolkit 3.0" documentation and retry?

Yes, it works according to the YOLOv3 documentation:
https://docs.nvidia.com/tlt/tlt-user-guide/text/object_detection/yolo_v3.html#id12
But since it is also mentioned that we can use option 1 (the training data loader), I thought the export step was the same for all object detection models.
https://docs.nvidia.com/tlt/tlt-user-guide/text/object_detection/yolo_v3.html#int8-mode-overview
I am new to TLT. Excuse me if my questions are confusing.

I executed this command:

!faster_rcnn export -m /workspace/tlt/local_dir/tlt_faster_rcnn_resnet18/unpruned_model/weights/final_model.tlt \
-o /workspace/tlt/local_dir/tlt_faster_rcnn_resnet18/unpruned_model/export/final_model.etlt \
-e /workspace/tlt/local_dir/tlt_faster_rcnn_resnet18/unpruned_model/final_spec.txt \
-k $KEY \
--data_type int8 \
--batch_size 2 \
--batches 10 \
--cal_image_dir /workspace/tlt/tlt-experiments/all_object_detection_approaches/dataset_1_temp/train/images \
--cal_data_file /workspace/tlt/local_dir/tlt_faster_rcnn_resnet18/unpruned_model/export/cal.tensorfile \
--cal_cache_file /workspace/tlt/local_dir/tlt_faster_rcnn_resnet18/unpruned_model/export/cal.bin  \
--engine_file /workspace/tlt/local_dir/tlt_faster_rcnn_resnet18/unpruned_model/export/trt.engine \
--force_ptq \
--gpu_index 1

and I got this error:

Traceback (most recent call last):
  File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/export.py", line 12, in <module>
  File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 219, in launch_export
  File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/app.py", line 201, in run_export
  File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/keras_exporter.py", line 360, in export
TypeError: get_calibrator() got an unexpected keyword argument 'image_mean'
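For what it's worth, that TypeError is a plain Python signature mismatch between caller and callee: keras_exporter.py passes an image_mean keyword that this build's get_calibrator() does not accept. A minimal, self-contained reproduction (function names taken from the traceback; bodies and values are invented for illustration):

```python
# Illustration only: an older callee whose signature lacks a keyword
# that a newer caller passes, mirroring the traceback above.
def get_calibrator(data_file, cache_file, n_batches, batch_size):
    # Invented body; the real TLT function builds a TensorRT INT8 calibrator.
    return (data_file, cache_file, n_batches, batch_size)

def run_export():
    try:
        # The newer caller adds image_mean (illustrative values),
        # which the older callee rejects at call time.
        get_calibrator("cal.tensorfile", "cal.bin", 10, 2,
                       image_mean=[103.939, 116.779, 123.68])
    except TypeError as err:
        return str(err)

error_message = run_export()
# error_message contains "unexpected keyword argument 'image_mean'"
```

This is why the error appears in one docker tag but not another: only the calling side was updated, which matches the description of a regression below.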

The etlt model is generated successfully.

But the cal.bin is not generated successfully due to the above error. This is a regression issue in the tlt 3.0-py3 docker.
As a workaround, please log in to the 3.0-dp-py3 docker and run the same command to generate cal.bin.
Sorry for the inconvenience.

Thank you. I will try it.
