Lprnet: Failed to run the tensorrt engine verification

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) : A100 PCIe
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) : lprnet
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here): v3.0-py3.0
• Training spec file (If have, please share here): Using the “tutorial_spec.txt” from the sample code.
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
The error occurs just by running the following code after the “Export in FP32 mode” step.

!tlt lprnet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/lprnet_epoch-24.tlt \
                   -k $KEY \
                   -e $SPECS_DIR/tutorial_spec.txt \
                   -o $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
                   --data_type fp32 \
                   --engine_file $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.engine
    Using TensorFlow backend.
    WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
    Using TensorFlow backend.
    2021-07-26 05:12:06,714 [INFO] iva.common.export.keras_exporter: Using input nodes: ['image_input']
    2021-07-26 05:12:06,714 [INFO] iva.common.export.keras_exporter: Using output nodes: ['tf_op_layer_ArgMax', 'tf_op_layer_Max']
    2021-07-26 05:12:06,714 [INFO] iva.lprnet.utils.spec_loader: Merging specification from /workspace/data/lprnet/specs/tutorial_spec.txt
    The ONNX operator number change on the optimization: 132 -> 61
    2021-07-26 05:12:18,205 [INFO] keras2onnx: The ONNX operator number change on the optimization: 132 -> 61
    Traceback (most recent call last):
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/export.py", line 215, in <module>
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/export.py", line 142, in main
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/export.py", line 211, in run_export
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/keras_exporter.py", line 371, in export
    TypeError: set_data_preprocessing_parameters() got an unexpected keyword argument 'image_mean'

It showed this error, but the notebook actually created the “lprnet_epoch-24.etlt”.
So I ignored the error, but then I hit another error at the next evaluate step in the notebook.

# Verify the tensorrt engine accuracy on the validation dataset
!tlt lprnet evaluate --gpu_index=$GPU_INDEX -e $SPECS_DIR/tutorial_spec.txt \
                     -m $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.engine \
                     --trt
    Using TensorFlow backend.
    WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
    Using TensorFlow backend.
    2021-07-26 05:16:47,074 [INFO] iva.lprnet.utils.spec_loader: Merging specification from /workspace/data/lprnet/specs/tutorial_spec.txt
    Traceback (most recent call last):
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/evaluate.py", line 152, in <module>
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/evaluate.py", line 148, in main
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/evaluate.py", line 105, in evaluate
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/trt_inferencer.py", line 31, in __init__
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/engine.py", line 113, in load_engine
    FileNotFoundError: [Errno 2] No such file or directory: '/workspace/data/lprnet/export/lprnet_epoch-24.engine'
    Exception ignored in: <bound method TRTInferencer.__del__ of <iva.common.inferencer.trt_inferencer.TRTInferencer object at 0x7fdb63ddc518>>
    Traceback (most recent call last):
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/trt_inferencer.py", line 139, in __del__
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/trt_inferencer.py", line 96, in clear_trt_session
    AttributeError: 'TRTInferencer' object has no attribute 'context'

It seemed that the model name did not match the example, so I modified the model name in the “-m” option as below, but it did not fix the issue.

!tlt lprnet evaluate --gpu_index=$GPU_INDEX -e $SPECS_DIR/tutorial_spec.txt \
                     -m $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
                     --trt
    Using TensorFlow backend.
    WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
    Using TensorFlow backend.
    2021-07-26 06:49:57,938 [INFO] iva.lprnet.utils.spec_loader: Merging specification from /workspace/data/lprnet/specs/tutorial_spec.txt
    [TensorRT] ERROR: coreReadArchive.cpp (32) - Serialization Error in verifyHeader: 0 (Magic tag does not match)
    [TensorRT] ERROR: INVALID_STATE: std::exception
    [TensorRT] ERROR: INVALID_CONFIG: Deserialize the cuda engine failed.
    Traceback (most recent call last):
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/evaluate.py", line 152, in <module>
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/evaluate.py", line 148, in main
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/evaluate.py", line 105, in evaluate
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/trt_inferencer.py", line 32, in __init__
    AttributeError: 'NoneType' object has no attribute 'max_batch_size'
    Exception ignored in: <bound method TRTInferencer.__del__ of <iva.common.inferencer.trt_inferencer.TRTInferencer object at 0x7f13cb431550>>
    Traceback (most recent call last):
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/trt_inferencer.py", line 139, in __del__
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/inferencer/trt_inferencer.py", line 96, in clear_trt_session
    AttributeError: 'TRTInferencer' object has no attribute 'context'

If you have any questions, please let me know.
Best regards.
Kaka

Please debug with the command below in a terminal instead of the Jupyter notebook.

$ tlt lprnet run ls /workspace/data/lprnet/export/lprnet_epoch-24.engine

If it is not available, please check your ~/.tlt_mounts.json file (a minimal example is shown after the command below).
Then log in to the docker to find where the engine is.

$ tlt lprnet run /bin/bash
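For reference, a minimal ~/.tlt_mounts.json looks like the sketch below. The source path is a placeholder you must replace with your own host directory; the destination mirrors the /workspace/data path seen in your log.

{
    "Mounts": [
        {
            "source": "/home/<username>/tlt-experiments/data",
            "destination": "/workspace/data"
        }
    ]
}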

Thank you for your quick response.
The “lprnet_epoch-24.engine” was not present at the specified path in the docker container, and I could not find this file anywhere in the container.
Which step will create it?

Best regards.
Kaka

According to your log, the engine should be generated by the following command.

!tlt lprnet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/lprnet_epoch-24.tlt \
                   -k $KEY \
                   -e $SPECS_DIR/tutorial_spec.txt \
                   -o $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
                   --data_type fp32 \
                   --engine_file $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.engine

Please check if it is available.
For debugging, you can run the commands below to generate the file inside the docker.

$ tlt lprnet run /bin/bash

# lprnet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/lprnet_epoch-24.tlt \
                -k $KEY \
                -e $SPECS_DIR/tutorial_spec.txt \
                -o $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
                --data_type fp32 \
                --engine_file $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.engine

I see. I tried the export command in the docker container, but the following command did not generate the engine file.

!lprnet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/lprnet_epoch-24.tlt \
                   -k $KEY \
                   -e $SPECS_DIR/tutorial_spec.txt \
                   -o $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
                   --data_type fp32 \
                   --engine_file $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.engine

    Using TensorFlow backend.
    WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
    Using TensorFlow backend.
    2021-07-26 08:04:12,033 [INFO] iva.common.export.keras_exporter: Using input nodes: ['image_input']
    2021-07-26 08:04:12,033 [INFO] iva.common.export.keras_exporter: Using output nodes: ['tf_op_layer_ArgMax', 'tf_op_layer_Max']
    2021-07-26 08:04:12,033 [INFO] iva.lprnet.utils.spec_loader: Merging specification from /workspace/data/lprnet/specs/tutorial_spec.txt
    The ONNX operator number change on the optimization: 132 -> 61
    2021-07-26 08:04:23,676 [INFO] keras2onnx: The ONNX operator number change on the optimization: 132 -> 61
    Traceback (most recent call last):
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/export.py", line 215, in <module>
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/export.py", line 142, in main
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/export.py", line 211, in run_export
      File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/export/keras_exporter.py", line 371, in export
    TypeError: set_data_preprocessing_parameters() got an unexpected keyword argument 'image_mean'

In another thread, you explained that we could ignore this error. But it seems that this command fails to create the engine file.

If you have any questions, please let me know.
Best regards.
Kaka

Yes, it is an issue.
As a workaround, please use tlt-converter to generate the TensorRT engine for LPRNet.
See GitHub - NVIDIA-AI-IOT/deepstream_lpr_app: Sample app code for LPR deployment on DeepStream
For example,

./tlt-converter -k nvidia_tlt -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 \
                models/LP/LPR/us_lprnet_baseline18_deployable.etlt -t fp16 -e models/LP/LPR/lpr_us_onnx_b16.engine

Hi

I am confused… Is the following export command the same as the tlt-converter command?

lprnet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/lprnet_epoch-24.tlt \
                   -k $KEY \
                   -e $SPECS_DIR/tutorial_spec.txt \
                   -o $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
                   --data_type fp32 \
                   --engine_file $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.engine

JFYI, the following “tlt-converter” command passed without any issue.

!tlt tlt-converter $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
                   -k $KEY \
                   -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 \
                   -t fp16 \
                   -e $USER_EXPERIMENT_DIR/export/lprnet_epoch-24_dynamic_batch.engine

Best regards.
Kaka

No, it is not the same.
The tlt-converter generates a TensorRT engine from an etlt model.
The “export” command generates an etlt model from a tlt model. For LPRNet, it can generate a TensorRT engine too.
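Putting it together, the working flow is the two-step sketch below. Both commands are assembled from earlier posts in this thread; adjust the paths, key, and optimization profile shapes to your own model.

# Step 1: "export" turns the trained .tlt checkpoint into an encrypted .etlt model
lprnet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/lprnet_epoch-24.tlt \
              -k $KEY \
              -e $SPECS_DIR/tutorial_spec.txt \
              -o $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt

# Step 2: tlt-converter builds the TensorRT .engine from the .etlt
tlt-converter $USER_EXPERIMENT_DIR/export/lprnet_epoch-24.etlt \
              -k $KEY \
              -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 \
              -t fp16 \
              -e $USER_EXPERIMENT_DIR/export/lprnet_epoch-24_dynamic_batch.engine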
