Can't load TRT engine: terminate called after throwing an instance of 'nvinfer1::MyelinError'

Hi, I am deploying the LPR model downloaded from NGC in Python. When I load the engine
trt_engine = load_engine(trt_runtime, trt_engine_path)

it says:
[TensorRT] ERROR: myelin/myelinGraphContext.h (26) - Myelin Error in MyelinGraphContext: 66 (myelinBinaryVersionMismatch : myelinGraphDeserializeBinary called with a buffer that's not a Myelin binary (invalid version)
)
terminate called after throwing an instance of 'nvinfer1::MyelinError'
what(): std::exception

My .trt file is built from the model downloaded from NGC (wget https://api.ngc.nvidia.com/v2/models/nvidia/tlt_lprnet/versions/deployable_v1.0/files/us_lprnet_baseline18_deployable.etlt).
I converted it with tlt-converter:
tlt-converter -k nvidia_tlt -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 ./us_lprnet_baseline18_deployable.etlt -t int8 -e ./lpr_us_onnx_int8.trt -w 700000000

I run everything in the container.

You mention that you run everything in the container.
Which container did you run?

The .etlt file was converted in the TLT container nvcr.io/nvidia/tlt-streamanalytics:v3.0-dp-py3.
I deploy the .trt file in the TensorRT 20.10 container nvcr.io/nvidia/tensorrt:20.10-py3.

Please generate the .trt file directly inside the TensorRT 20.10 container.
First, copy the .etlt file to that container, then download tlt-converter according to Overview — Transfer Learning Toolkit 3.0 documentation

I followed your advice and converted the .etlt file inside the TensorRT 20.10 container with
./tlt-converter -k nvidia_tlt -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 ./us_lprnet_baseline18_deployable.etlt -t int8 -e ./lpr_us_onnx_int8.trt

However, when deploying the .trt file, I found the buffer size is wrong. The error information is:
Traceback (most recent call last):
File "trt_old.py", line 243, in <module>
inputs, outputs, bindings, stream = allocate_buffers(trt_engine)
File "trt_old.py", line 66, in allocate_buffers
host_mem = cuda.pagelocked_empty(size, dtype)
pycuda._driver.MemoryError: cuMemHostAlloc failed: out of memory

I found the engine binding shape is (-1, 3, 48, 96). The batch size can't be -1 and I don't know why.
Others have come across the same issue, as this post says:
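For context, a -1 batch dimension is normal for an explicit-batch engine built with dynamic shapes; the trouble is that computing an allocation size directly from a shape containing -1 gives a nonsensical value, which is one way cuda.pagelocked_empty can fail. A minimal sketch of sizing the host buffer against the optimization profile's maximum batch instead (buffer_elements is a hypothetical helper, not the thread's actual allocate_buffers code):

```python
def buffer_elements(binding_shape, max_batch_size):
    """Number of elements to allocate for a binding whose batch
    dimension is dynamic (-1): substitute the optimization profile's
    max batch size so the host buffer covers the worst case."""
    n = 1
    for dim in binding_shape:
        n *= max_batch_size if dim == -1 else dim
    return n

# LPR input binding (-1, 3, 48, 96), max profile batch 16 (from the -p argument):
size = buffer_elements((-1, 3, 48, 96), 16)
print(size)  # 221184 elements, i.e. 16 * 3 * 48 * 96
```

At inference time the actual batch size still has to be set on the execution context (in the TensorRT 7 Python API this is done with set_binding_shape) before buffers are bound, otherwise the context reports the shape as unresolved.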

So, can I say that you meet the same issue as Python run LPRNet with TensorRT? What is your inference code?

Yes, the same issue.
My inference code is the same as in Python run LPRNet with TensorRT. I tested the code with the LPD model (downloaded from NGC and converted with tlt-converter in the same way as the LPR model) to make sure the inference code works well.

Can you reproduce my issue? Is it a problem with the LPR model or with the tlt-converter tool?

The LPR model is not based on TLT detectnet_v2, so it is different from the LPD model. See Overview — Transfer Learning Toolkit 3.0 documentation

So, for the TRT engine you have generated, please check first with the official inference command.
See LPRNet — Transfer Learning Toolkit 3.0 documentation

I ran inference in the TLT 3.0 container and it shows the same error:
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2021-04-30 06:55:59,865 [INFO] /usr/local/lib/python3.6/dist-packages/iva/lprnet/utils/spec_loader.pyc: Merging specification from /workspace/specs/lpr_spec.txt
[TensorRT] ERROR: myelin/myelinGraphContext.h (26) - Myelin Error in MyelinGraphContext: 66 (myelinBinaryVersionMismatch : myelinGraphDeserializeBinary called with a buffer that's not a Myelin binary (invalid version)
)
terminate called after throwing an instance of 'nvinfer1::MyelinError'
what(): std::exception
Aborted (core dumped)
Traceback (most recent call last):
File "/usr/local/bin/lprnet", line 8, in <module>
sys.exit(main())
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/entrypoint/lprnet.py", line 12, in main
File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/entrypoint/entrypoint.py", line 296, in launch_job
AssertionError: Process run failed.

I also see that the batch size is -1 in the conversion log:
[WARNING] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
[INFO] Detected input dimensions from the model: (-1, 3, 48, 96)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 3, 48, 96) for input: image_input
[INFO] Using optimization profile opt shape: (4, 3, 48, 96) for input: image_input
[INFO] Using optimization profile max shape: (16, 3, 48, 96) for input: image_input
[INFO] Detected 1 inputs and 2 output network tensors.
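As a side note on what those profile lines mean: at runtime the actual batch must fall inside the min..max range of one optimization profile, and the matching profile has to be selected before execution. A toy sketch of that selection logic (profile_for_batch is a hypothetical helper for illustration, not part of tlt-converter or TensorRT):

```python
def profile_for_batch(batch, profiles):
    """Return the index of the first optimization profile whose
    min..max batch range covers the requested batch, or None."""
    for i, (mn, _opt, mx) in enumerate(profiles):
        if mn <= batch <= mx:
            return i
    return None

# The -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 argument defines one
# profile with min/opt/max batch sizes of 1/4/16:
profiles = [(1, 4, 16)]
print(profile_for_batch(8, profiles))   # 0: batch 8 fits in 1..16
print(profile_for_batch(32, profiles))  # None: larger than the max shape
```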

I am afraid there is something wrong with the us_lprnet_baseline18_deployable.etlt file downloaded from NGC:
wget https://api.ngc.nvidia.com/v2/models/nvidia/tlt_lprnet/versions/deployable_v1.0/files/us_lprnet_baseline18_deployable.etlt

I do not think so. The .etlt file should be fine. Some users can deploy it and run inference successfully. See

Can you share the full command you use to run inference?

The model conversion:

root@99e6798fbdc8:/workspace/lpr# tlt-converter -k nvidia_tlt -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 ./us_lprnet_baseline18_deployable.etlt -t int8 -e ./lpr_us_onnx_int8_old.trt
[WARNING] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
[INFO] Detected input dimensions from the model: (-1, 3, 48, 96)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 3, 48, 96) for input: image_input
[INFO] Using optimization profile opt shape: (4, 3, 48, 96) for input: image_input
[INFO] Using optimization profile max shape: (16, 3, 48, 96) for input: image_input
[INFO] Detected 1 inputs and 2 output network tensors.

The model inference:

root@99e6798fbdc8:/workspace/lpr# lprnet inference --gpu_index=0 -m lpr_us_onnx_int8_new.trt -i car1.jpg -e /workspace/specs/lpr_spec.txt --trt
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2021-04-30 08:05:19,440 [INFO] /usr/local/lib/python3.6/dist-packages/iva/lprnet/utils/spec_loader.pyc: Merging specification from /workspace/specs/lpr_spec.txt
[TensorRT] ERROR: myelin/myelinGraphContext.h (26) - Myelin Error in MyelinGraphContext: 66 (myelinBinaryVersionMismatch : myelinGraphDeserializeBinary called with a buffer that's not a Myelin binary (invalid version)
)
terminate called after throwing an instance of 'nvinfer1::MyelinError'
  what():  std::exception
Aborted (core dumped)
Traceback (most recent call last):
  File "/usr/local/bin/lprnet", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/entrypoint/lprnet.py", line 12, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/216c8b41e526c3295d3b802489ac2034/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/entrypoint/entrypoint.py", line 296, in launch_job
AssertionError: Process run failed.

I ran all of this in the TLT 3.0 container.

Please download the version of tlt-converter below and use it to generate the TRT engine again.

wget https://developer.nvidia.com/cuda111-cudnn80-trt72
unzip cuda111-cudnn80-trt72
chmod +x tlt-converter

See Overview — Transfer Learning Toolkit 3.0 documentation

I tried again following your advice, but it failed with the same error information.

I cannot reproduce the error. My steps are below. Can you double-check?

Step:
$ tlt lprnet run /bin/bash

root@32b0be3ea045:/workspace/demo_2.0/lprnet# wget https://api.ngc.nvidia.com/v2/models/nvidia/tlt_lprnet/versions/deployable_v1.0/files/us_lprnet_baseline18_deployable.etlt

root@32b0be3ea045:/workspace/demo_2.0/lprnet# tlt-converter -k nvidia_tlt -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 ./us_lprnet_baseline18_deployable.etlt -t int8 -e ./lpr_us_onnx_int8.trt
[WARNING] onnx2trt_utils.cpp:220: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
[INFO] Detected input dimensions from the model: (-1, 3, 48, 96)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 3, 48, 96) for input: image_input
[INFO] Using optimization profile opt shape: (4, 3, 48, 96) for input: image_input
[INFO] Using optimization profile max shape: (16, 3, 48, 96) for input: image_input
[WARNING] Half2 support requested on hardware without native FP16 support, performance will be negatively affected.
[INFO] Detected 1 inputs and 2 output network tensors.

root@32b0be3ea045:/workspace/demo_2.0/lprnet# lprnet inference -m lpr_us_onnx_int8.trt -i /workspace/demo_2.0/lprnet/data/openalpr/train/image -e /workspace/examples/lprnet/specs/tutorial_spec.txt --trt
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2021-05-06 08:51:54,636 [INFO] /usr/local/lib/python3.6/dist-packages/iva/lprnet/utils/spec_loader.pyc: Merging specification from /workspace/examples/lprnet/specs/tutorial_spec.txt
Using TRT engine for inference, setting batch size to the one in eval_config: 1
/workspace/demo_2.0/lprnet/data/openalpr/train/image/wts-lg-000178.jpg:6LSU216
/workspace/demo_2.0/lprnet/data/openalpr/train/image/car9-1.jpg:ASC7399
/workspace/demo_2.0/lprnet/data/openalpr/train/image/wts-lg-000189.jpg:FK4W3L
/workspace/demo_2.0/lprnet/data/openalpr/train/image/wts-lg-000171.jpg:DCK6344

I tried your steps and succeeded. I think the key is the tlt-converter. When I convert using the tlt-converter tool inside the TLT 3.0 container, inference succeeds.
But when I convert using the cuda111-cudnn80-trt72 tool, it still fails.

Thanks for the info. I will close this topic.