Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GeForce 4090
• DeepStream Version: 6.2
• TensorRT Version: 8.5
• NVIDIA GPU Driver Version (valid for GPU only): 525
• Issue Type (questions, new requirements, bugs): questions
I noticed that in the YOLOv4 config from here, the engine is built in int8 precision, and layer-device-precision is included to specify that some layers should use fp32 instead. Does the layer-device-precision property only affect inference, or does it also change how the engine is built?
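For context, the relevant part of the nvinfer config looks roughly like this (the layer names below are placeholders, not the actual YOLOv4 layer names):

```ini
[property]
# 1 = int8 network mode
network-mode=1
# Semicolon-separated <layer-name>:<precision>:<device> entries;
# these layers are forced to fp32 while the rest of the network is int8.
layer-device-precision=conv_1:fp32:gpu;conv_2:fp32:gpu
```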
It affects how the engine is built, since the engine is always generated for inference in DeepStream.
Thank you for the reply.
I want to build the engine in advance instead of waiting for it to be built the first time I run a DeepStream app. How do I generate an engine that incorporates the layer-device-precision property that DeepStream supports if I use an external tool to build the engine, such as tao-converter?
I don’t know about tao-converter, but you can use the DeepStream app to generate the engine. The first time you run the app, you will see the message “Trying to create engine from model files.” DeepStream will attempt to generate the engine and save it in a subfolder, for example:
/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine. If you run the app with sudo privileges, the generation will succeed, and the engine will be saved with the message “Serialized CUDA engine to file: /opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine successfully.” You can then edit the config file to use that engine by setting its path in the model-engine-file field. The next time you run the app, the engine won’t be regenerated; it will be loaded from the file.
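For example, in the nvinfer config (using the generated path from above):

```ini
[property]
model-engine-file=/opt/nvidia/deepstream/deepstream-5.0/samples/models/Primary_Detector/resnet10.caffemodel_b1_gpu0_fp16.engine
```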
Thank you for the clarification.
I’m aware that I can make DeepStream re-use the engine by changing the config; I’ve done what you described from time to time, but it’s very inconvenient when I work with multiple models, say 10 of them.
tao-converter is from the NVIDIA TAO Toolkit; it lets me build the engine in advance, and DeepStream can use that engine directly. My goal is that the first time I run a DeepStream app, all the engines it needs are already built, so I don’t have to wait for an hour and then go back and change all the config files so that DeepStream won’t build new engines the next time I run the same app.
The layer-device-precision property was added in DeepStream 6.1.1, and the documentation doesn’t specify whether DeepStream uses it during inference or during engine building. If DeepStream uses it during engine building, then I need to find a way to replicate the setting in tao-converter. If I can’t, I will have to use DeepStream to build the engine with mixed precision.
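(For anyone else looking into this: one possible way to replicate per-layer precision outside DeepStream is TensorRT’s trtexec tool, which has a --layerPrecisions flag. This is an unverified sketch; the layer names are placeholders, and I haven’t confirmed that the resulting engine matches what DeepStream builds.)

```shell
# Build an int8 engine while forcing two (hypothetical) layers to fp32.
# --precisionConstraints=obey is required for --layerPrecisions to take effect.
trtexec --onnx=yolov4.onnx \
        --int8 \
        --precisionConstraints=obey \
        --layerPrecisions=conv_1:fp32,conv_2:fp32 \
        --saveEngine=yolov4_mixed.engine
```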