[Question] Clarifications needed on exporting trained YoloV3 model

I just need some confirmation that I understand the doc correctly.

  1. For FP32/FP16/INT8 deployment, I have two options: using the .etlt or the .engine file. The .engine file is hardware-specific, so it needs to be generated on the deployment hardware (i.e., a Jetson). It can be generated either with tlt-converter (for Jetson), or DeepStream will generate a .engine file on the fly from the provided .etlt file. My question is: is there any difference in the model’s performance (speed & accuracy) between using tlt-converter vs. letting DeepStream generate the .engine file on the fly?

  2. For INT8 deployment, regarding the calibration cache file: what is its purpose? What is the general advice on creating this file (e.g., the number of images to use, and whether to draw them from the train set, the val set, or both)?

  3. I’m aware that the number of images used for calibration needs to be at least batch_size * batches. What happens if the number of images exceeds that value? How should I set batch_size and batches to get optimal performance from the model?

  4. Regarding max_batch_size when creating the .engine file with tlt-converter: in any DeepStream model config file there is a batch-size parameter (= number of source elements in the pipeline). What is the relationship between these two parameters? My understanding is batch-size <= max_batch_size; is that correct?

  5. When using tlt-converter, the input dimensions need to match the input dimensions used during tlt-export. When deploying with DeepStream, the model config file has a parameter called uff-input-dims, and this value also needs to match those input dimensions. Is that correct?

  1. It should be the same.
  2. In order to use INT8 mode, we must calibrate the model to run 8-bit inferences, so we need to generate an INT8 calibration table. Ideally, it is best to use at least 10-20% of the training data to calibrate the model. The more data provided during calibration, the closer INT8 inferences are to FP32 inferences.
  3. In the tlt-export command, there are two related options, --batch_size and --batches. The number of calibration images = batch_size * batches. If that product is larger than the number of images in --cal_image_dir, the tool will raise an error (see the export sketch after this list).
  4. It is not related.
  5. For uff-input-dims=<c;h;w;0>, where c = number of channels, h = height of the model input, w = width of the model input, and the trailing 0 implies CHW format (see the converter and config sketches below).
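
To tie answers 2 and 3 together, here is a minimal sketch of an INT8 export with tlt-export. This is not taken from the thread: the paths, the $KEY variable, and the spec file name are placeholders, and the flag set follows the TLT 2.0 docs. Note that batch_size * batches (8 * 100 = 800 here) must not exceed the number of images in --cal_image_dir:

```
# Hedged sketch: export a trained YOLOv3 .tlt to .etlt and generate the INT8
# calibration cache in one pass. All paths, the key, and the counts are
# placeholders for illustration.
tlt-export yolo \
  -m /workspace/experiments/weights/yolo_resnet18.tlt \
  -e /workspace/specs/yolo_retrain_resnet18_kitti.txt \
  -k $KEY \
  -o /workspace/export/yolo_resnet18.etlt \
  --data_type int8 \
  --batch_size 8 \
  --batches 100 \
  --cal_image_dir /workspace/data/calibration \
  --cal_cache_file /workspace/export/cal.bin
```

For answers 4 and 5, a matching tlt-converter call on the Jetson might look like the sketch below. Again a hedged example: -d must repeat the input dimensions used at export time, -m sets the engine's max batch size, and the BatchedNMS output name and 3,384,1248 dimensions are assumptions for a typical TLT YOLOv3 model:

```
# Hedged sketch: build the hardware-specific .engine on the deployment Jetson.
tlt-converter /workspace/export/yolo_resnet18.etlt \
  -k $KEY \
  -d 3,384,1248 \
  -o BatchedNMS \
  -c /workspace/export/cal.bin \
  -t int8 \
  -m 16 \
  -e /workspace/export/yolo_resnet18.engine
```

On the DeepStream side, the corresponding nvinfer config fragment would then carry the same dimensions in uff-input-dims. A hypothetical fragment (the property names are standard nvinfer keys; the values simply mirror the placeholder commands above):

```
[property]
tlt-encoded-model=/workspace/export/yolo_resnet18.etlt
tlt-model-key=<your key>
int8-calib-file=/workspace/export/cal.bin
network-mode=1
uff-input-dims=3;384;1248;0
uff-input-blob-name=Input
output-blob-names=BatchedNMS
batch-size=1
```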

Hi @Morganh, thank you for the quick reply. Just a follow-up:

  1. Does that mean I will get the best INT8 performance (better speed with comparable accuracy) if I use all available images from both the train and val sets? And for the recommended amount (10-20%), how should I pick the images?

If you pick all the images, the INT8 accuracy will be the closest to FP32.
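
As for picking the 10-20%, the thread does not prescribe a method; a random sample of the training images is one common choice. A hedged sketch (the paths, the 15% fraction, and the .jpg extension are placeholders; assumes shuf is available):

```
# Randomly copy ~15% of the training images into the calibration folder.
mkdir -p /workspace/data/calibration
total=$(find /workspace/data/training/images -name '*.jpg' | wc -l)
find /workspace/data/training/images -name '*.jpg' \
  | shuf \
  | head -n $(( total * 15 / 100 )) \
  | xargs -I{} cp {} /workspace/data/calibration/
```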
