• Jetson NX
• Classification Resnet18
• TLT Version 3.0
• Training spec file:
I trained a classification model with TLT to use as a secondary model in DeepStream. I followed the documentation and got 97% accuracy on my test set. I then used tlt-converter to build an engine file (for the NX), but when I run this engine I only get about 20% accuracy — basically I think the model is just guessing. Also, interestingly, my confidences are 0.8 and above with the .tlt model but drop to around 0.4 with the engine.
For the training data I cropped the objects of interest out of the images and trained the model on those cropouts. I did not resize or pad them, since I assumed the dataloader takes care of this — is that correct?
I suspect the problem is that when the engine runs as a secondary classifier, the images it sees are preprocessed very differently from the ones the model was trained on. To debug this, can you give me more details on the following:
- I am training with preprocess mode “caffe” — what exactly does this do? Do I need to set anything special in DeepStream to emulate it? Does it operate in RGB or BGR, and would that make a difference?
- DeepStream has a parameter called “offsets” (“array of mean values of color components to be subtracted from each pixel”). How do I find out what the mean-subtraction values should be? Does this preprocessing step on the cropouts affect what the images should look like at training time? What are the default mean-subtraction values during training?
- in what order does DeepStream preprocess the cropouts for the secondary classifier? crop → mean subtraction → normalization → bilinear resize → padding (bottom/right only)? What color is the padding? Is that color consistent between the TLT training dataloader and DeepStream?
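For the “offsets” question, this is my current guess at the relevant secondary-gie config fragment, assuming caffe-style preprocessing. The values and the channel order of `offsets` are exactly what I'm unsure about — I'm assuming offsets are listed in the order given by `model-color-format`:

```ini
[property]
# My attempt at emulating "caffe" preprocessing -- values unconfirmed.
net-scale-factor=1.0
# 1 = BGR; if the model was trained on BGR input, this should match?
model-color-format=1
# Mean subtraction; I assume BGR order here since model-color-format=1.
offsets=103.939;116.779;123.68
```

Is this the right way to mirror the training-time preprocessing in DeepStream?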
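To make my first question concrete: here is a sketch of what I *think* “caffe” preprocess mode does, based on the behavior of Keras's `preprocess_input(mode="caffe")` (channel flip to BGR plus ImageNet mean subtraction, no scaling). The mean values below are my assumption — please correct me if TLT uses something else.

```python
import numpy as np

# Assumed ImageNet per-channel means, in BGR order (my guess, not confirmed).
CAFFE_MEANS_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_preprocess(rgb_image: np.ndarray) -> np.ndarray:
    """rgb_image: HxWx3 array with values in [0, 255], RGB channel order."""
    bgr = rgb_image[..., ::-1].astype(np.float32)  # flip RGB -> BGR
    return bgr - CAFFE_MEANS_BGR                   # mean subtraction only, no scaling

# Example: a uniform mid-gray image
img = np.full((2, 2, 3), 128.0, dtype=np.float32)
out = caffe_preprocess(img)
print(out[0, 0])  # each channel is 128 minus its BGR mean
```

Is this what the TLT classification dataloader actually does under preprocess mode “caffe”?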
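And for the ordering question, here is the offline debugging harness I plan to use: it runs my test crops through the pipeline in the order I *assume* DeepStream applies (resize keeping aspect ratio → bottom/right zero padding → BGR conversion → mean subtraction), so I can check whether the .tlt model's accuracy collapses the same way. Every step here is an assumption I'd like confirmed, including the black padding; the resize is a nearest-neighbour stand-in for bilinear just to stay dependency-free.

```python
import numpy as np

def emulate_sgie(crop_rgb, net_w=224, net_h=224,
                 offsets=(103.939, 116.779, 123.68), scale=1.0):
    """Run one RGB crop through my assumed DeepStream preprocessing order."""
    h, w = crop_rgb.shape[:2]
    r = min(net_w / w, net_h / h)                    # keep aspect ratio
    new_w, new_h = int(w * r), int(h * r)
    # Nearest-neighbour resize as a stand-in for bilinear interpolation.
    ys = (np.arange(new_h) / r).astype(int).clip(0, h - 1)
    xs = (np.arange(new_w) / r).astype(int).clip(0, w - 1)
    resized = crop_rgb[ys][:, xs]
    canvas = np.zeros((net_h, net_w, 3), np.float32)  # padding color: black (assumed)
    canvas[:new_h, :new_w] = resized                  # pad bottom/right only (assumed)
    bgr = canvas[..., ::-1]                           # RGB -> BGR (assumed)
    return scale * (bgr - np.asarray(offsets, np.float32))

out = emulate_sgie(np.full((50, 100, 3), 128.0, np.float32))
print(out.shape)  # network input shape, e.g. (224, 224, 3)
```

If someone can confirm (or correct) the step order, the padding color, and the interpolation mode, I can verify whether this mismatch explains the accuracy drop.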