Dim size in .prototxt files

What do the dim values represent in the .prototxt files? For example:

/opt/nvidia/deepstream/deepstream/samples/models/Primary_Detector_Nano/resnet10.prototxt has dim values of 480x272.


/opt/nvidia/deepstream/deepstream/samples/models/Primary_Detector/resnet10.prototxt has dim values of 640x368.

The performance section of the documentation suggests lowering these values to 480x272 on the Jetson Xavier and NX. Why? Does specifying smaller input dimensions make inference faster, at the expense of accuracy?

What would happen if we changed these values to higher ones? Shouldn’t these dims match the size of the images the model was trained on?


• Hardware Platform (Jetson / GPU) Nano, NX
• DeepStream Version 5.0GA

Hey Jason
The model is a fully convolutional network, so it can run inference at image dimensions different from those it was trained on. Technically, any input dimension greater than 16x16 (WxH) can work with the model.
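To illustrate why a fully convolutional model accepts different input sizes, here is a minimal sketch: only the size of the output grid changes with the input, not the weights. The `total_stride=16` value is an assumption inferred from the 16x16 minimum mentioned above, and `output_grid` is a hypothetical helper, not part of DeepStream.

```python
def output_grid(width, height, total_stride=16):
    """Approximate spatial size of the final feature map.

    total_stride=16 is an assumption inferred from the 16x16 minimum
    quoted in the thread; it is not read from the model file.
    """
    if width < total_stride or height < total_stride:
        raise ValueError("input is smaller than the assumed total stride")
    # Each output cell covers a total_stride x total_stride input patch.
    return width // total_stride, height // total_stride


for w, h in [(640, 368), (480, 272)]:
    print(f"{w}x{h} input -> {output_grid(w, h)} output grid")
```

Both default sizes are multiples of 16, so under this assumption they map to whole grids (40x23 and 30x17).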

Thanks @bcao - so why were the defaults of 480x272 and 640x368 chosen? Is it true that making them smaller improves performance, and that in general making them bigger gives better accuracy?

480x272 is the model's input size for Nano; you can check the prototxt on the Nano platform. You can also choose an even smaller size.
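If you want to check the dims programmatically, something like the following sketch may help. It assumes the file uses Caffe's usual input layout, where the four dim entries are batch, channels, height, width. The `SAMPLE` text and the input name `input_1` are illustrative and not copied from the shipped file, and `read_input_dims` is a hypothetical helper.

```python
import re

# Illustrative snippet in Caffe's input_shape layout (N, C, H, W).
# Assumption: this is NOT copied from the shipped resnet10 file.
SAMPLE = """
input: "input_1"
input_shape {
  dim: 1
  dim: 3
  dim: 272
  dim: 480
}
"""


def read_input_dims(prototxt_text):
    """Return the first four `dim:` values as (batch, channels, height, width)."""
    dims = [int(v) for v in re.findall(r"\bdim:\s*(\d+)", prototxt_text)]
    if len(dims) < 4:
        raise ValueError("expected at least four dim entries")
    return tuple(dims[:4])


n, c, h, w = read_input_dims(SAMPLE)
print(f"{w}x{h}, {c} channels, batch {n}")
```

Note that the file lists height before width, so a size quoted as 480x272 (WxH) would appear as 272 followed by 480.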

Yeah, you are right. A bigger input image consumes more compute.
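As a rough way to reason about the cost, convolution work in a fully convolutional network grows roughly in proportion to the number of input pixels. The sketch below compares the two default sizes on that basis. It is only a proxy and an assumption: real throughput also depends on memory bandwidth, precision, and TensorRT optimizations. `relative_cost` is a hypothetical helper.

```python
def relative_cost(size_a, size_b):
    """Ratio of input pixel counts, a rough proxy for convolution compute."""
    (wa, ha), (wb, hb) = size_a, size_b
    return (wa * ha) / (wb * hb)


# Compare the default (640x368) against the Nano size (480x272), both WxH.
ratio = relative_cost((640, 368), (480, 272))
print(f"640x368 needs about {ratio:.2f}x the compute of 480x272")
```

By this estimate the larger input costs about 1.8 times as much per frame, which is consistent with the suggestion to lower the dims on smaller Jetson devices.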

So how did you settle on 480x272 - just from testing/tuning and finding that it was a good balance of performance and accuracy?

Yes, we have tested the model at 480x272, but we don’t have experience with smaller sizes. You can try them per your requirements to find a balance.
