I noticed that in the YOLOv4 config from here, the engine is built in INT8 precision, and layer-device-precision is included to specify that some layers should use FP32 instead. Does the layer-device-precision property only affect inference, or does it also change how the engine is built?
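For reference, the documented format of that property is a semicolon-separated list of <layer-name>:<precision>:<device-type> entries, so the relevant part of such a config looks roughly like this (the layer names below are illustrative placeholders, not copied from the actual YOLOv4 config):

```
[property]
# network-mode=1 selects INT8 for the network as a whole
network-mode=1
# force the listed layers to FP32 on the GPU
layer-device-precision=cls/Sigmoid:fp32:gpu;box/Concat:fp32:gpu
```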
I want to build the engine in advance instead of waiting for a DeepStream app to build it the first time I run it. If I use an external tool such as tao-deploy or tao-converter to build the engine, how do I generate an engine that incorporates the layer-device-precision property that DeepStream supports?
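As far as I can tell, tao-converter only exposes a network-wide precision flag, something like the following (flag names as I remember them from the TAO docs, so verify with tao-converter -h; the key, dims, and file names are placeholders):

```
# -t sets a single precision for the whole network (fp32/fp16/int8);
# I don't see a per-layer precision flag.
tao-converter -k $ENCRYPTION_KEY \
  -d 3,544,960 \
  -t int8 \
  -c calibration.bin \
  -e yolov4_int8.engine \
  yolov4.etlt
```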
I don't know about tao-deploy or tao-converter, but you can use the DeepStream app itself to generate the engine:

1. The first time you run the app, you will see the message "Trying to create engine from model files". DeepStream will attempt to build the engine and save it alongside the model, for example: /opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine.
2. If the app has write permission to that directory (for example, when run with sudo), the build succeeds and you will see the message "Serialized CUDA engine to file: /opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine successfully".
3. Edit the config file to point the model-engine-file field at that engine. On subsequent runs, the engine is loaded from the file instead of being regenerated.
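Assuming a standard nvinfer config file, the change in step 3 looks like this (using the engine path from the example above):

```
[property]
model-engine-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine
```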
I'm aware that I can make DeepStream re-use the engine by changing the config; I have done what you described before, but it's very inconvenient when working with multiple models, say 10. tao-deploy and tao-converter are from the NVIDIA TAO Toolkit; they let me build engines in advance, and DeepStream can use those engines directly. My goal is that the first time I run a DeepStream app, all the engines it needs are already built, so I don't have to wait an hour and then go back and change every config file to stop DeepStream from rebuilding the engines on the next run.
The layer-device-precision property was added in DeepStream 6.1.1, and the documentation doesn't specify whether DeepStream applies it during inference or during engine building. If it is applied during engine building, then I need to find a way to replicate the setting in tao-deploy / tao-converter. If I can't do that, I will have to use DeepStream itself to build the mixed-precision engine.
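For context, in the raw TensorRT API per-layer precision is a build-time setting, so whatever tool builds the engine would have to apply it while constructing the network. A minimal Python sketch of the idea (assuming the TensorRT 8.x Python API and an ONNX model; the layer names and file paths are placeholders, and this is not how tao-converter works internally):

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.INFO)

# Layers that should stay in FP32 inside an otherwise-INT8 engine
# (placeholder names; use the real layer names from your network).
FP32_LAYERS = {"cls/Sigmoid", "box/Concat"}

builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, TRT_LOGGER)

with open("yolov4.onnx", "rb") as f:  # placeholder model file
    assert parser.parse(f.read()), parser.get_error(0)

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.INT8)  # network-wide INT8
# Make TensorRT honor the per-layer constraints set below.
config.set_flag(trt.BuilderFlag.OBEY_PRECISION_CONSTRAINTS)
# A real INT8 build also needs a calibrator (config.int8_calibrator)
# or explicit dynamic ranges; omitted to keep the sketch short.

for i in range(network.num_layers):
    layer = network.get_layer(i)
    if layer.name in FP32_LAYERS:
        layer.precision = trt.float32  # force this layer to FP32

engine_bytes = builder.build_serialized_network(network, config)
with open("yolov4_mixed.engine", "wb") as f:
    f.write(engine_bytes)
```

If layer-device-precision works like this under the hood, it would be baked into the serialized engine, which is exactly why I'd need the external tool to support it too.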