Error while running converted engine in DeepStream app on Xavier NX

Hi Team,

I am using Transfer Learning Toolkit v2.0_py3 and training a classifier with input_image_size: “3,145,350”. After generating the engine with the command below and then using it in my DeepStream application, it gives an error message.

tlt-converter final_model_282.etlt -k tlt_encode -c final_model_int8_cache_282.bin -o predictions/Softmax -d 3,145,350 -i nchw -e Age_ep177_tlt7.engine -m 64 -t int8 -b 64
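
For readability, here is the same command broken out, with what each flag means (flag semantics as given in the tlt-converter usage for TLT 2.0; all values are the ones from the command above):

# -k : encryption key used when the .etlt model was exported
# -c : INT8 calibration cache generated during export
# -o : output node name of the model
# -d : input dimensions in C,H,W order (channels, height, width)
# -i : input order of the network
# -e : path of the TensorRT engine file to generate
# -m : maximum batch size of the engine
# -t : engine precision
# -b : batch size used for calibration
tlt-converter final_model_282.etlt \
  -k tlt_encode \
  -c final_model_int8_cache_282.bin \
  -o predictions/Softmax \
  -d 3,145,350 -i nchw \
  -e Age_ep177_tlt7.engine \
  -m 64 -t int8 -b 64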

However, with -d 3,224,224 during engine generation, I was able to generate the engine file successfully, and it runs fine with my DeepStream application as well.

What is the reason for that? Can you please suggest where the gap is?

Thanks.

How about tlt-infer? Can you run tlt-infer successfully with “3,145,350”?

Yes Morganh, I am able to run tlt-infer with TLT v2.0_py3 on my training machine, and it also generates the CSV file.
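
For reference, the kind of command I run for that (exact flag names from memory of the TLT 2.0 classification docs, so please treat them as an assumption and verify with tlt-infer classification --help; MODEL.tlt, IMAGE_DIR and classmap.json stand in for my actual paths):

# Runs inference over a directory of images and writes a result.csv
tlt-infer classification -m MODEL.tlt -k tlt_encode -d IMAGE_DIR -b 32 -cm classmap.json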

Can you share all the config files you use when you run DS? Please attach the command and full log too. Thanks.

I think I have found the root cause of your error.
Please modify your command as below.

-d 3,145,350

to

-d 3,350,145

Thanks Morganh for the reply. I modified my command as suggested, but I still get an error.

With -d 3,350,145 I get an error message while converting the engine on the NX.

With -d 3,145,350 I am able to generate the engine,

but when I run this engine in DS it gives the error I mentioned.

If your training images are 350x145, please set

-d 3,145,350

during tlt-converter.

Also, when you run DS, please set in the DS config file

input-dims=3;145;350;0
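
If you want to double-check which height and width the generated engine actually expects, trtexec (shipped with TensorRT, usually at /usr/src/tensorrt/bin/trtexec on Jetson) can load it; a sketch, with the caveat that how much binding detail is logged varies by TensorRT version:

# Deserialize the engine and run a timing pass; with --verbose the log
# includes the engine's bindings, so you can confirm H=145 and W=350.
/usr/src/tensorrt/bin/trtexec --loadEngine=Age_ep177_tlt7.engine --verbose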

Thanks Morganh for the reply.
I will try that.

Any update? Could you provide your DS config file too?

Yes Morganh, I am attaching my config file below.

From what I have analysed, when I use an input size of 3,h,w with h = w, I get no issue in the DS app, but with h > w or h < w I run into issues (presumably because a swapped height/width goes unnoticed when h = w).

[property]
gpu-id=0
net-scale-factor=1
#model-engine-file=./Model/age_int8_tlt7_v_2.engine
model-engine-file=./Model/Age_int8_ep177_tlt7.engine

labelfile-path=./Model/age_classification_label.txt
batch-size=1

# 0=FP32 and 1=INT8 mode

network-mode=1
input-object-min-width=0
input-object-min-height=0
process-mode=2
model-color-format=1
gpu-id=0
gie-unique-id=4
operate-on-gie-id=1
operate-on-class-ids=0
is-classifier=1
output-blob-names=predictions/Softmax
classifier-async-mode=1
classifier-threshold=0.20
#secondary-reinfer-interval=10

I want to confirm the below works on your side (the dimension values are summarized in the sketch after this list).

  1. Your training images are 350x145 (width x height)
  2. In the training spec, please set input_image_size: “3,145,350”
  3. After training is done, please set “-d 3,145,350” during tlt-converter.
  4. When you run DS, please set in the DS config file

input-dims=3;145;350;0
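
For clarity, these are the four places the dimensions appear, and all of them must agree, with height always before width (values taken from this thread):

# 1) training images:   350 wide x 145 high            (width x height)
# 2) training spec:     input_image_size: "3,145,350"  (C,H,W)
# 3) tlt-converter:     -d 3,145,350                   (C,H,W)
# 4) DS config file:    input-dims=3;145;350;0         (c;h;w;0)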

Yes, I have done the same things. Let me verify it again.

Thanks.

Also, it is necessary to set input-dims explicitly in your DS config file.

input-dims=c;h;w;0 # where c = number of channels, h = height of the model input, w = width of the model input, 0 implies CHW format.
uff-input-blob-name=input_1
output-blob-names=predictions/Softmax #output node name for classification

See more details in https://pgambrill.gitlab-master-pages.nvidia.com/tlt-docs/text/deploying_to_deepstream.html#deepstream-configuration-file
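
For completeness, a sketch of the config file posted earlier in this thread with those lines merged in (everything else is copied from the posted config, with the duplicate gpu-id and the commented-out engine path dropped; input_1 is the input node name given above):

[property]
gpu-id=0
net-scale-factor=1
model-engine-file=./Model/Age_int8_ep177_tlt7.engine
labelfile-path=./Model/age_classification_label.txt
batch-size=1
# 0=FP32 and 1=INT8 mode
network-mode=1
input-object-min-width=0
input-object-min-height=0
process-mode=2
model-color-format=1
gie-unique-id=4
operate-on-gie-id=1
operate-on-class-ids=0
is-classifier=1
# explicitly declare the model input: c;h;w;0 (0 = CHW)
input-dims=3;145;350;0
uff-input-blob-name=input_1
output-blob-names=predictions/Softmax
classifier-async-mode=1
classifier-threshold=0.20
#secondary-reinfer-interval=10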