Convert .pth to .onnx on xavier, int8 calibration

mdegans · March 17, 2020, 10:16pm

I made a redaction app using DeepStream and am trying to follow this tutorial to optimize the tutorial’s pre-trained model for deployment in my app, instead of the fp32 one I am currently using (from this old redaction repo). I have a couple of questions:

1. What is the best way to convert the .pth file to .onnx on Xavier, since the retinanet-examples Python package won’t install? Can I do this on a pair of 1080s instead?
2. Likewise, should I run the int8 calibration for the Xavier on the Xavier itself, or on x86 Nvidia?
3. My app accepts any number of sources and sets the batch-size at runtime to match. When converting to an engine with tensorrt-utils’s onnx_to_tensorrt.py, should I use the --explicit-batch option to generate an optimization profile? Or do I need to do something else and have DeepStream do the job for me?

SunilJB · March 18, 2020, 4:25am

Moving to Jetson Xavier forum so that Jetson team can take a look.

AastaLLL · March 18, 2020, 7:34am

Hi,

1. You can convert the model to ONNX directly with the PyTorch framework:
https://pytorch.org/docs/stable/onnx.html

Converting a model to ONNX is platform independent.
However, you will need to convert the ONNX model into a TensorRT engine on the target device itself.

2. Please apply the INT8 calibration on the device.

3. Yes. Please use it if the number of input sources is fixed.

Thanks.
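For readers following along, the export step from point 1 could look roughly like the sketch below. It is not the tutorial's script: the function names, the tensor names "input" and "output", and the opset version are assumptions made for illustration, and a real RetinaNet export will likely have several outputs with different names. Only `torch.onnx.export` and its `input_names`, `output_names`, `dynamic_axes`, and `opset_version` arguments come from PyTorch itself.

```python
from typing import Dict, Sequence


def batch_dynamic_axes(tensor_names: Sequence[str]) -> Dict[str, Dict[int, str]]:
    """Mark dimension 0 of every named tensor as a symbolic 'batch' axis."""
    return {name: {0: "batch"} for name in tensor_names}


def export_to_onnx(model, input_shape, onnx_path,
                   input_names=("input",), output_names=("output",),
                   dynamic_batch=False, opset_version=11):
    """Export a PyTorch model to ONNX (hypothetical helper, not a library API).

    model       -- a torch.nn.Module already loaded from the .pth file, on the CPU
    input_shape -- full shape of one dummy batch, e.g. (1, 3, 720, 1280)
    """
    # Imported here so the helper above stays usable without PyTorch installed.
    import torch

    model.eval()  # switch off dropout / batch-norm updates before tracing
    dummy = torch.randn(*input_shape)  # contents do not matter, only the shape

    names = list(input_names) + list(output_names)
    torch.onnx.export(
        model,
        dummy,
        onnx_path,
        input_names=list(input_names),
        output_names=list(output_names),
        # With dynamic_batch=False the batch size of `dummy` is baked in.
        dynamic_axes=batch_dynamic_axes(names) if dynamic_batch else None,
        opset_version=opset_version,
    )
```

Something like `export_to_onnx(model, (1, 3, 720, 1280), "model.onnx", dynamic_batch=True)` could then be run on the x86 machine, with the resulting .onnx file copied to the Xavier for the TensorRT build. The input resolution here is a placeholder.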

mdegans · March 18, 2020, 3:11pm

Many thanks, AastaLLL.

1. Thanks. If it’s platform independent, I will just run it on x86 Nvidia with the tutorial’s script and tear it apart later to figure out what it is doing.
2. OK, will do, since the tensorrt-utils scripts do run on the device.

I am completely new to all this model stuff, so forgive my follow-ups:

1. I understand that dynamic batch size means implicit batch size means -1 in the batch dimension. Is that all correct?
2. Does a .pth file necessarily have an explicit batch dimension? I know nothing about PyTorch other than the file is in python pickle format.
3. If I want to add and remove sources at runtime (I do), should I generate an ONNX model with a dynamic batch size or not? What are the benefits and downsides of each?

AastaLLL · March 23, 2020, 8:46am

Hi,

You can find more information in this document:

1. Dynamic batch = explicit batch = set it as -1

2. Not needed.

3. You will need to specify the range of each dynamic dimension when building the engine.
This is because TensorRT chooses its inference algorithms at build time.
How far the tensor size can change is limited because the memory is pre-allocated.

Thanks.
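As a rough illustration of point 3, the dimension range is usually given as a minimum, optimum, and maximum shape. The sketch below builds a command line for TensorRT's trtexec tool, which is swapped in here because the exact options of tensorrt-utils's onnx_to_tensorrt.py are not shown in this thread. The helper names, the input tensor name "input_1", the 3x720x1280 resolution, and the file names are all assumptions; only the trtexec flags themselves are real.

```python
from typing import List, Sequence


def shape_arg(input_name: str, batch: int, chw: Sequence[int]) -> str:
    """Format one shape the way trtexec expects it, e.g. 'input_1:4x3x720x1280'."""
    dims = "x".join(str(d) for d in (batch, *chw))
    return f"{input_name}:{dims}"


def trtexec_command(onnx_path: str, engine_path: str, input_name: str,
                    chw: Sequence[int], min_batch: int, opt_batch: int,
                    max_batch: int, int8: bool = False) -> List[str]:
    """Build a trtexec argument list with one batch-size optimization profile."""
    if not (1 <= min_batch <= opt_batch <= max_batch):
        raise ValueError("expected 1 <= min_batch <= opt_batch <= max_batch")
    cmd = [
        "trtexec",
        f"--onnx={onnx_path}",
        f"--saveEngine={engine_path}",
        # The three shapes bound the batch sizes the engine will accept.
        f"--minShapes={shape_arg(input_name, min_batch, chw)}",
        f"--optShapes={shape_arg(input_name, opt_batch, chw)}",
        f"--maxShapes={shape_arg(input_name, max_batch, chw)}",
    ]
    if int8:
        cmd.append("--int8")  # enable INT8 precision for the build
    return cmd


if __name__ == "__main__":
    # Hypothetical values: 1 to 8 sources, tuned for 4.
    print(" ".join(trtexec_command("model.onnx", "model.engine", "input_1",
                                   (3, 720, 1280), 1, 4, 8)))
```

The printed command would be run on the Xavier itself, since the engine has to be built on the device. Batch sizes outside the min/max range are rejected at runtime, so the maximum should cover the largest number of sources the app is expected to handle.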

mdegans · March 23, 2020, 3:33pm

Thanks for answering my questions! I think I can figure out the rest from the documentation and source.

AastaLLL · March 24, 2020, 5:08am

Feel free to file a topic if you need help. : )

mdegans · March 24, 2020, 5:22am

Thanks, Aasta :)

I am taking one of the “Optimization and Deployment of TensorFlow Models with TensorRT” GTC courses tomorrow, but if anything isn’t covered I will certainly post a topic.

Edit: I am very silly. The class is Tuesday, April 7. Gives me a head start, I guess.