Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GeForce 4090
• DeepStream Version: 6.2
• TensorRT Version: 8.5
• NVIDIA GPU Driver Version (valid for GPU only): 525
• Issue Type( questions, new requirements, bugs): questions
I noticed that in the YOLOv4 config from here, the engine is built in INT8 precision and the config includes `layer-device-precision` to specify that some layers should use FP32 instead. Does the `layer-device-precision` property only affect inference, or does it also change how the engine is built?
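For context, the property is a semicolon-separated list of `<layer-name>:<precision>:<device>` entries in the `[property]` group of the nvinfer config. A minimal sketch (the layer names below are illustrative, not taken from the actual YOLOv4 config):

```ini
[property]
# INT8 engine overall
network-mode=1
# Force selected layers to FP32 on the GPU
layer-device-precision=cls/mul:fp32:gpu;box/mul_6:fp32:gpu
```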
It affects how the engine is built, since DeepStream always generates the engine it will later use for inference.
@miguel.taylor
Thank you for the reply.
I want to build the engine in advance instead of waiting for the first run of a DeepStream app to build it. How do I generate an engine that incorporates the `layer-device-precision` property that DeepStream supports if I use an external tool such as `tao-deploy` or `tao-converter` to build the engine?
I don’t know about `tao-deploy` or `tao-converter`, but you can use the DeepStream app to generate the engine. The first time you run the app, you will see the message “Trying to create engine from model files.” DeepStream will attempt to generate the engine and save it in a subfolder, for example: `/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine`. If you run the app with sudo privileges, the generation will succeed and the engine will be saved, with the message “Serialized CUDA engine to file: /opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine successfully.” You can edit the config file to use that engine by setting its path in the `model-engine-file` field. The next time you run the app, the engine won’t be regenerated but loaded from the file.
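For example, reusing the engine path from above, the relevant config change is:

```ini
[property]
# Point nvinfer at the pre-built engine so it is loaded instead of rebuilt
model-engine-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine
```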
@miguel.taylor
Thank you for the clarification.
I’m aware that I can make DeepStream re-use the engine by changing the config; I have done what you described from time to time, but that’s very inconvenient when I work with multiple models, say 10 of them. `tao-deploy` and `tao-converter` are from the NVIDIA TAO Toolkit; they let me build the engines in advance so DeepStream can use them directly. My goal is that the first time I run a DeepStream app, all the engines it needs are already built, so I don’t have to wait for an hour and then go back and change all the config files to stop DeepStream from building new engines the next time I run the same app.
The `layer-device-precision` property was added in DeepStream 6.1.1, and the documentation doesn’t specify whether DeepStream uses it during inference or during engine building. If it is used during engine building, I need to find a way to replicate the setting in `tao-deploy` / `tao-converter`. If I can’t do that, I will have to use DeepStream to build the mixed-precision engines.
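One external route worth noting is TensorRT’s own `trtexec` tool, which (in TensorRT 8.4 and later) accepts per-layer precision overrides at build time; the sketch below assumes an ONNX model, and the file and layer names are placeholders:

```shell
# Sketch: build an INT8 engine while forcing two layers to FP32.
# model.onnx, calib.cache, and the layer names are hypothetical.
trtexec --onnx=model.onnx \
        --int8 --calib=calib.cache \
        --precisionConstraints=obey \
        --layerPrecisions="cls/mul:fp32,box/mul_6:fp32" \
        --saveEngine=model_b1_gpu0_int8.engine
```

Whether the resulting per-layer precisions match what DeepStream would produce from `layer-device-precision` would still need to be verified against the DeepStream-generated engine.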