Model.engine is always being built

Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) Jetson Xavier AGX
• DeepStream Version 6.3.0
• JetPack Version (valid for Jetson only) 5.1
• TensorRT Version 8.5.2.2

My model.engine file is always being rebuilt, no matter whether I set model-engine-file in the config. It says:

WARNING: Deserialize engine failed because file path: /home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine  open error
0:00:02.515789206 58324 0xffff280022c0 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-gpu-inference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1976> [UID = 1]: deserialize engine from file :/home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine  failed
0:00:02.564015541 58324 0xffff280022c0 WARN                 nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger:<primary-gpu-inference-engine> NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2081> [UID = 1]: deserialize backend context from engine from file :/home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine  failed, try rebuild
0:00:02.564395400 58324 0xffff280022c0 INFO                 nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger:<primary-gpu-inference-engine> NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: Trying to create engine from model files

Building the TensorRT Engine
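Before digging into permissions, a quick check that the exact path is openable rules out the simplest causes. This is only a sketch (it assumes Python 3 and should be run as the same user that launches DeepStream; the path string is copied from my config):

# check_engine_path.py -- minimal sketch; verifies that the exact configured
# path can be opened by the current user
import os

path = "/home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine"

print(repr(path))                        # repr() makes stray whitespace visible
print("exists:  ", os.path.exists(path))
print("readable:", os.access(path, os.R_OK))
with open(path, "rb") as f:              # a plain open(), like any consumer of the file
    print("opened OK, first bytes:", f.read(8))

If a script like this opens the file but nvinfer does not, the problem is in the value the plugin actually receives rather than in the filesystem.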

I saw some posts suggesting that the problem is usually either that the .engine path is not set correctly, or something related to permissions. But my path and permissions seem OK. This is my environment:

Deepstream is at /opt/nvidia/deepstream/deepstream-6.3
Script runs at /home/ubuntu/EdgeServer/
The config file is at /home/ubuntu/EdgeServer/config
The model engine is being created at /home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine
In my config file I have:

#This definition overrides the onnx-file definition
model-engine-file=/home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine 

#Yolo v5
onnx-file=/home/ubuntu/EdgeServer/model/yolov5s.onnx
labelfile-path=/home/ubuntu/EdgeServer/model/labels_yolov5s.txt
num-detected-classes=80

Note that the warning says "Deserialize engine failed because file path: /home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine open error". So it tries to read the engine at the same location where it was created, which led me to look at permissions…

DeepStream is able to write to /home/ubuntu/EdgeServer/, otherwise the model wouldn't have been created. In fact:

ubuntu@ubuntu:~$ ls -l
drwxrwxr-x 11 ubuntu ubuntu 4096 Apr 24 13:23 EdgeServer

and

ubuntu@ubuntu:~/EdgeServer$ ls -l
-rw-rw-r--  1 ubuntu ubuntu 34078941 Apr 24 13:23  model_b4_gpu0_fp32.engine

I have also tried a brute-force approach and ran ubuntu@ubuntu:~$ sudo chmod -R 777 ~/EdgeServer, but the rebuild always occurs. Am I missing something here? Any suggestions?

Could you attach the whole config file?

We recommend that you build the engine through the config file the first time. If you use a trtexec command, make sure it matches the parameters in the configuration file.

I have tried what you described with deepstream-test1; it worked normally on my side.

model-engine-file: /home/nvidia/EdgeServer/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine

I am not using trtexec to build the engine file; it is being built through the config file. This is my config file: dstest4_pgie_nvinfer_yolov5_config.txt (1.3 KB)

Concerning the test with deepstream-test1 you asked for:

  1. Running: python3 deepstream_test_1.py /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p.h264
  2. First run, curiously: I see the display with the car detections, but the engine was not created at /home/ubuntu/EdgeServer/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine (or anywhere else; I ran sudo find / -type f -name resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine), although the config file sets it to:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-file=/opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/resnet10.caffemodel
proto-file=/opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/resnet10.prototxt
model-engine-file=/home/ubuntu/EdgeServer/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine
labelfile-path=/opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/labels.txt
int8-calib-file=/opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/cal_trt.bin
force-implicit-batch-dim=1
batch-size=30
process-mode=1
model-color-format=0

#0=FP32, 1=INT8, 2=FP16 mode
network-mode=1
num-detected-classes=4
interval=0
gie-unique-id=1
uff-input-order=0
uff-input-blob-name=input_1
output-blob-names=conv2d_bbox;conv2d_cov/Sigmoid
#scaling-filter=0
#scaling-compute-hw=0
cluster-mode=2
infer-dims=3;544;960

[class-attrs-all]
pre-cluster-threshold=0.2
eps=0.2
group-threshold=1

  3. Second run: it is still trying to create the new model:

ubuntu@ubuntu:~/EdgeServer$ python3 deepstream_test_1.py /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p.h264
Creating Pipeline

Creating Source

Creating H264Parser

Creating Decoder

Creating nv3dsink

Playing file /opt/nvidia/deepstream/deepstream-6.3/samples/streams/sample_720p.h264
Adding elements to Pipeline

Linking elements in the Pipeline

Starting pipeline

Opening in BLOCKING MODE
WARNING: Deserialize engine failed because file path: /home/ubuntu/EdgeServer/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine open error
0:00:02.574614600 65981 0x11c31640 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::deserializeEngineAndBackend() <nvdsinfer_context_impl.cpp:1976> [UID = 1]: deserialize engine from file :/home/ubuntu/EdgeServer/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine failed
0:00:02.621846541 65981 0x11c31640 WARN nvinfer gstnvinfer.cpp:679:gst_nvinfer_logger: NvDsInferContext[UID 1]: Warning from NvDsInferContextImpl::generateBackendContext() <nvdsinfer_context_impl.cpp:2081> [UID = 1]: deserialize backend context from engine from file :/home/ubuntu/EdgeServer/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine failed, try rebuild
0:00:02.621931281 65981 0x11c31640 INFO nvinfer gstnvinfer.cpp:682:gst_nvinfer_logger: NvDsInferContext[UID 1]: Info from NvDsInferContextImpl::buildModel() <nvdsinfer_context_impl.cpp:2002> [UID = 1]: Trying to create engine from model files
WARNING: [TRT]: The implicit batch dimension mode has been deprecated. Please create the network with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag whenever possible.

The model engine file will be generated in the model-file directory, not in the directory you configured with model-engine-file. You need to set the model-engine-file directory to be the same as the model-file directory; just change the file name.
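For reference, the generated name appears to follow a fixed pattern: the model file name (or the ONNX file name, for ONNX models) plus the batch size, GPU id, and precision. A rough sketch of that pattern, inferred from the file names in this thread rather than from any official API, so treat it as an approximation:

# expected_engine_name.py -- sketch of the naming pattern observed in this
# thread (e.g. resnet10.caffemodel_b30_gpu0_int8.engine or
# yolov5s.onnx_b1_gpu0_fp16.engine); may vary across DeepStream versions
def expected_engine_path(model_file, batch_size, gpu_id, network_mode):
    precision = {0: "fp32", 1: "int8", 2: "fp16"}[network_mode]  # network-mode values
    return f"{model_file}_b{batch_size}_gpu{gpu_id}_{precision}.engine"

print(expected_engine_path(
    "/opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/resnet10.caffemodel",
    batch_size=30, gpu_id=0, network_mode=1))

With batch-size=30 and network-mode=1 from the config above, this yields resnet10.caffemodel_b30_gpu0_int8.engine, which matches the ls output below rather than the resnet18_trafficcamnet name set in model-engine-file.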

Understood. I confirm that the engine was created in the model-file directory.

ubuntu@ubuntu:~/EdgeServer$ ls -la /opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/
total 8000
drwxrwxrwx 2 root   root      4096 Apr 26 13:03 .
drwxr-xr-x 8 root   root      4096 Apr  4 12:54 ..
-rw-r--r-- 1 root   root      1126 Dec 31  1969 cal_trt.bin
-rw-r--r-- 1 root   root        28 Dec 31  1969 labels.txt
-rw-r--r-- 1 root   root   6244865 Dec 31  1969 resnet10.caffemodel
-rw-rw-r-- 1 ubuntu ubuntu 1917401 Apr 26 13:18 resnet10.caffemodel_b30_gpu0_int8.engine
-rw-r--r-- 1 root   root      7605 Dec 31  1969 resnet10.prototxt

However, when running deepstream_test_1, I still receive that warning message and it still rebuilds the model. This is despite having run ubuntu@ubuntu:~$ sudo chmod -R 777 /opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/, and despite the config file now having these two properties pointing to the same directory:

model-file=/opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/resnet10.caffemodel
model-engine-file=/opt/nvidia/deepstream/deepstream-6.3/samples/models/Primary_Detector/resnet18_trafficcamnet.etlt_b1_gpu0_int8.engine

I have an issue here. We can keep trying to stop deepstream_test_1 from rebuilding the engine, and that solution will probably be related to permissions. But even after solving it, I will still have a problem.

Note that my config file loads the model from an ONNX file. Therefore I use onnx-file, and model-file doesn't exist in my config file. So, what do you think about trying to stop the rebuild in my environment? Do you believe it is better to address the problem with deepstream_test_1 plus my config, or to address my config directly?

Address it directly in your config. deepstream_test_1 is just a demo for testing your issue. You can refer to our demo pgie_yolov5_config.txt to learn how to set up the config file for an ONNX model.

Yes, my config sets exactly the same properties as pgie_yolov5_config.txt. I checked the path in each property and also checked for typos. But it keeps rebuilding.

This is my config:

[property]
net-scale-factor=0.0039215697906911373
#model-color-format: 0=RGB, 1=BGR, 2=Gray
model-color-format=0

#Yolo v5
onnx-file=/home/ubuntu/EdgeServer/model/yolov5s.onnx
labelfile-path=/home/ubuntu/EdgeServer/model/labels_yolov5s.txt
num-detected-classes=80

#This definition overrides the onnx-file definition
model-engine-file=/home/ubuntu/EdgeServer/model_b4_gpu0_fp32.engine

#infer-dims=3;672;672
#batch-size: Usually overridden by the script; remember that the model must be exported with --dynamic to allow n batches
batch-size=1
#network-mode: 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
#interval: Number of frames to skip without making inference
interval=0
gie-unique-id=1
process-mode=1
network-type=0
#is-classifier=0
#cluster-mode: 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=1

parse-bbox-func-name=NvDsInferParseYolo
custom-lib-path=/home/ubuntu/EdgeServer/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
pre-cluster-threshold=0.2
eps=0.2
group-threshold=1
topk=300
#roi-top-offset=0
#roi-bottom-offset=0
#detected-min-w=0
#detected-min-h=0
#detected-max-w=0
#detected-max-h=0

And this is pgie_yolov5_config.txt:

[property]
gpu-id=0
net-scale-factor=0.0039215686
offsets=0;0;0
model-color-format=0
labelfile-path=yolov5_labels.txt
model-engine-file=../../../models/yolov5/yolov5s.onnx_b1_gpu0_fp16.engine
onnx-file=../../../models/yolov5/yolov5s.onnx
infer-dims=3;672;672
maintain-aspect-ratio=1
batch-size=1
##0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
interval=0
gie-unique-id=1
is-classifier=0
#network-type=0
cluster-mode=3
output-blob-names=BatchedNMS
parse-bbox-func-name=NvDsInferParseCustomBatchedYoloV5NMSTLT
custom-lib-path=../../../post_processor/libnvds_infercustomparser_tao.so

[class-attrs-all]
pre-cluster-threshold=0.5
roi-top-offset=0
roi-bottom-offset=0
detected-min-w=0
detected-min-h=0
detected-max-w=0
detected-max-h=0
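To compare the two files mechanically rather than by eye, a small script can list every property on which they differ. A minimal sketch (it assumes Python 3 and that both configs sit in the current directory; section headers and comment lines are skipped):

# diff_configs.py -- sketch: report every key=value property that differs
# between my config and the demo config
def props(path):
    out = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith(("#", "[")) and "=" in line:
                key, value = line.split("=", 1)
                out[key.strip()] = value.strip()
    return out

mine = props("dstest4_pgie_nvinfer_yolov5_config.txt")
demo = props("pgie_yolov5_config.txt")
for key in sorted(set(mine) | set(demo)):
    if mine.get(key) != demo.get(key):
        print(f"{key}: mine={mine.get(key)!r} demo={demo.get(key)!r}")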

So, the properties are the same, the paths are correct, the filenames too, and I changed the filesystem permissions to 777. I am wondering if there is some kind of cached state, environment variable, or something else that might be interfering.

No. Judging from your config file, you need to set the batch size to 4. Could you just run with the generated engine, such as yolov5s.onnx_b1_gpu0_fp16.engine (please refer to the name generated in your own environment), without defining the engine name yourself?
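If you are unsure what name was generated, a one-off search like this sketch (assuming the engine was written somewhere under /home/ubuntu/EdgeServer, as earlier in this thread) will list the candidates to copy into model-engine-file:

# find_generated_engine.py -- sketch: list every engine file under the
# project directory so the exact generated name can be reused
import glob

for path in glob.glob("/home/ubuntu/EdgeServer/**/*.engine", recursive=True):
    print(path)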

I tried many combinations in the config file, and it turned out that I managed to make it work by changing the absolute paths to relative paths. But it still seems strange to me.

onnx-file=../model/yolov5s.onnx
labelfile-path=../model/labels_yolov5s.txt
model-engine-file=../model_b4_gpu0_fp32.engine
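For what it is worth, this behavior is consistent with relative paths being resolved against the directory that contains the config file (here /home/ubuntu/EdgeServer/config) rather than against the working directory. A minimal sketch of where the entries above would land under that rule; the rule itself is an assumption on my part, inferred from the fact that these relative paths work:

# resolve_paths.py -- sketch: where the relative entries land, assuming they
# are resolved against the config file's directory (an assumption)
import os

config_dir = "/home/ubuntu/EdgeServer/config"
for rel in ("../model/yolov5s.onnx",
            "../model/labels_yolov5s.txt",
            "../model_b4_gpu0_fp32.engine"):
    print(os.path.normpath(os.path.join(config_dir, rel)))
# -> /home/ubuntu/EdgeServer/model/yolov5s.onnx, etc.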
