Getting different results from the same tlt-encoded-model on different platforms

Hi
We trained on a 2080 Ti and obtained the .etlt and .bin files. After that, we generated the engine file via the DeepStream SDK on two different platforms, a 2080 Ti and a Jetson Xavier NX. On both platforms we get different results for the same video with the same configuration file. Please help us find the reason behind it. Some details about the configuration are below:
Deepstream SDK 5.0
Model Training Environment (2080 TI)

[property]
gpu-id=0
net-scale-factor=1
int8-calib-file=age.bin
tlt-encoded-model=age.etlt
tlt-model-key="abcd1234"
input-dims=3;244;244;0
uff-input-blob-name=input_1
output-blob-names=predictions/Softmax 
labelfile-path=age_classification.txt
batch-size=1
network-mode=1
input-object-min-width=0
input-object-min-height=0
process-mode=2
model-color-format=1
gie-unique-id=4
operate-on-gie-id=1
operate-on-class-ids=2
is-classifier=1 
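As a side note, the `[property]` section above sets `gpu-id` twice, which is a common copy-paste slip. A small, hypothetical Python check like the one below (the sample config text is a placeholder mirroring the snippet above) can flag duplicate keys before a config is deployed:

```python
import configparser

# Placeholder nvinfer-style config text; in practice, read it from your
# config file. The duplicated gpu-id mirrors the snippet above.
CONFIG_TEXT = """\
[property]
gpu-id=0
net-scale-factor=1
batch-size=1
gpu-id=0
"""

def find_duplicate_keys(text):
    """Return True if the config text contains a duplicated option.

    configparser is strict by default and raises DuplicateOptionError
    on the second occurrence of a key within a section.
    """
    parser = configparser.ConfigParser()
    try:
        parser.read_string(text)
        return False
    except configparser.DuplicateOptionError:
        return True

print(find_duplicate_keys(CONFIG_TEXT))  # True: gpu-id appears twice
```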

Thanks.


Hi,

I am also getting the same kind of issue…

Hi,
I am also facing a similar issue. I get different results from the .trt file generated through TLT and from the INT8 engine generated through DeepStream (using the .etlt, .bin, and key files) on the same 2080 Ti machine, and I also get different results between INT8 on the 2080 Ti and INT8 on the Jetson Xavier NX. Please help us figure out where we are going wrong. And this is not the case with INT8 only; I see the same issue with FP16 and FP32 as well.
Thanks.

Please specify details about the different results. What kind of results?
Please note that the 2080 Ti and Jetson NX have different GPU architectures: the 2080 Ti is from the Turing series, while the NX uses the Volta architecture. The NX supports FP32/FP16/INT8 operations, while the 2080 Ti supports FP32/FP16.
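For reference, the architecture gap can be stated in terms of CUDA compute capability: the RTX 2080 Ti is sm_75 (Turing) while the Jetson Xavier NX's integrated GPU is sm_72 (Volta). A minimal illustrative sketch; the table below is hand-written for the two GPUs in this thread, not queried from hardware:

```python
# Illustrative mapping from CUDA compute capability to architecture,
# hand-written for the two GPUs discussed here. On real hardware you
# would query the capability (e.g. via deviceQuery or pycuda).
ARCH_BY_CC = {
    (7, 2): "Volta (Jetson Xavier NX)",
    (7, 5): "Turing (RTX 2080 Ti)",
}

def arch_name(major, minor):
    """Look up the architecture name for a compute capability pair."""
    return ARCH_BY_CC.get((major, minor), "unknown")

print(arch_name(7, 5))  # Turing (RTX 2080 Ti)
print(arch_name(7, 2))  # Volta (Jetson Xavier NX)
```

Because the two devices have different compute capabilities, a TensorRT engine built on one is not portable to the other; each platform needs its own engine build.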

Thanks for reply @amycao
We have tested on two machines with a video from the entry zone of a mall, where we are trying to classify the gender of each person. We found that on one machine it works well, but on the other machine it does not classify gender properly; in some cases a male is shown as female and a female as male. The results also change when we try different network modes (FP16, INT8, FP32).
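One way to make "different results" concrete is to dump the per-object predictions from each machine and compute an agreement rate. A minimal sketch; the label lists below are made-up placeholders, not real output:

```python
# Hypothetical per-object gender predictions dumped from two runs of
# the same video on two machines (placeholder data for illustration).
preds_2080ti = ["male", "female", "male", "male", "female"]
preds_nx     = ["male", "male",   "male", "female", "female"]

def agreement_rate(a, b):
    """Fraction of objects on which the two runs agree."""
    assert len(a) == len(b), "runs must cover the same objects"
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)

rate = agreement_rate(preds_2080ti, preds_nx)
print(f"agreement: {rate:.0%}")  # agreement: 60%
```

Reporting a number like this (per mode: FP32/FP16/INT8) makes it easier to tell small numerical drift from a genuine configuration or engine-build problem.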

Just to clarify, we are talking about a gender classifier, and we are trying to run it on two GPUs: a 2080 Ti (used for training, with ResNet-50) and a Jetson Xavier NX (used for deployment). The results vary on both for all modes (INT8, FP16, FP32). None works accurately. The TRT engine gives better results on the 2080 Ti.

Further, we have tried retraining the gender classifier using mobilenet_v1 on a V100 (on AWS) and tested on both the 2080 Ti and the Jetson Xavier NX. Again, the results are not good on the 2080 Ti or the Xavier NX, whereas on the V100 the results are good on the validation dataset.

Your help is appreciated.

Best Regards

Moving this topic into TLT forum.

@thakur.sandeep.srs
Could you try to generate the TRT engine with the same tool (tlt-converter) separately on the 2080 Ti and the NX?

On the 2080 Ti, please use the tool directly inside the docker container.

! tlt-converter your.etlt -k yourkey -c your_cal.bin -o predictions/Softmax -d 3,244,244 -i nchw -e trt.engine -m 64 -t int8 -b 64

Then configure trt.engine in DeepStream and run.

On the NX, please download the Jetson version of tlt-converter. Then run tlt-converter directly on the NX.

$ wget https://developer.nvidia.com/tlt-converter-trt71

$ unzip tlt-converter-trt71

$ chmod +x tlt-converter

$ ./tlt-converter your.etlt -k yourkey -c your_cal.bin -o predictions/Softmax -d 3,244,244 -i nchw -e trt.engine -m 64 -t int8 -b 64

Then configure this trt.engine in DeepStream and run.

Note: in the DeepStream config file, since you have already generated the TRT engine, you do not need to set

tlt-encoded-model=age.etlt
tlt-model-key="abcd1234"

Instead, set the following:

model-engine-file=trt.engine
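Putting that together, the relevant part of the nvinfer config would look roughly like this (file names follow the examples above; treat the exact values as placeholders):

```ini
[property]
# Use the pre-built engine instead of rebuilding from the .etlt
model-engine-file=trt.engine
# tlt-encoded-model=age.etlt   <- no longer needed
# tlt-model-key="abcd1234"     <- no longer needed
network-mode=1
batch-size=1
```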

Thanks Morganh for your response.

We have tried tlt-converter for various modes (INT8, FP16 and FP32), but we still see the same difference in detections between the 2080 Ti and the Jetson Xavier NX. The training was done on the 2080 Ti and was based on persons.

Later today, considering that the NX has the Volta architecture, and as per the earlier comments in this thread, we tried removing blurred images, changed the image dimensions in the training config to be rectangular for the person-based classifier input images for the two gender target classes, and trained on a V100 on AWS. The mAP is 0.98. We will check the results on the real target scenario and share the analysis soon.

To elaborate further, we initially trained with MobileNet and then with ResNet-50 today on the V100, using cropped person data from PeopleNet for the person-based gender classifier. We took 50% medium-quality images from the target site and 50% high-quality images from third-party and internet sources. Because faces are covered by masks, we did not use the face as the basis for the gender classifier, as is usually done in the industry.

Please suggest if we are missing anything in the configuration or dataset. This is our fifth training round, and we really need your help on best practices to consider for training.

To clarify further, our target platform is the Jetson Xavier NX. Would training on a 2080 Ti or V100 make a difference from the NX perspective, due to the architecture? We are using DeepStream 5, JetPack 4.4, TensorRT 7.1, TLT 2, and tlt-converter 7.1.
Please suggest.

Appreciate your help in this regards.

Hi @pooja7923x,
This topic was created by @thakur.sandeep.srs. Can we focus on his original question first?
I commented on the method above.

With the same etlt model and cal.bin, he can generate two versions of the TRT engine, on the 2080 Ti and the NX separately.
Please @thakur.sandeep.srs check with my way.

If that does not help, @pooja7923x and @thakur.sandeep.srs, please try training the default Jupyter notebook against the KITTI dataset to check whether the same problem occurs.