Running inference and evaluation on a .engine model

Hi, I’m working on an object detection pipeline on a Jetson board. I successfully converted the ONNX model and integrated it into a DeepStream pipeline using nvinfer.

To avoid replicating the full video processing pipeline (camera input, encoding/decoding, inference, etc.) just for evaluation, I’d like to evaluate the .engine models directly in Python.

Specifically, I want to:

  1. Load the .engine model.
  2. Run inference on a set of images after preprocessing.
  3. Save the inference results to disk.

This would then allow me to compute metrics, compare converted models at different precisions, and analyse their accuracy trade-offs. I haven’t been able to find a complete example that demonstrates such a workflow, so I’m looking for help.
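
For reference, this is roughly the standalone flow I’m aiming for, sketched against the TensorRT 10 named-tensor Python API with pycuda for the device buffers. It is only a sketch: the engine path, input shape, and the random array standing in for real preprocessing are placeholders, and the preprocessing would have to mirror nvinfer’s scaling/offsets for any comparison to be fair.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates a CUDA context on import
import pycuda.driver as cuda
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)


def infer(engine, batch):
    """Run one inference on a preprocessed batch and return all output tensors."""
    context = engine.create_execution_context()
    stream = cuda.Stream()
    host, device, outputs = {}, {}, {}

    # TensorRT 10 uses the named-tensor API instead of bindings.
    for i in range(engine.num_io_tensors):
        name = engine.get_tensor_name(i)
        shape = tuple(engine.get_tensor_shape(name))      # assumes static shapes
        dtype = trt.nptype(engine.get_tensor_dtype(name))
        host[name] = np.empty(shape, dtype=dtype)
        device[name] = cuda.mem_alloc(host[name].nbytes)
        context.set_tensor_address(name, int(device[name]))

        if engine.get_tensor_mode(name) == trt.TensorIOMode.INPUT:
            np.copyto(host[name], batch.astype(dtype))
            cuda.memcpy_htod_async(device[name], host[name], stream)

    context.execute_async_v3(stream_handle=stream.handle)

    for name in host:
        if engine.get_tensor_mode(name) == trt.TensorIOMode.OUTPUT:
            cuda.memcpy_dtoh_async(host[name], device[name], stream)
            outputs[name] = host[name]
    stream.synchronize()
    return outputs


if __name__ == "__main__":
    runtime = trt.Runtime(TRT_LOGGER)              # keep alive for the engine's lifetime
    with open("model_fp16.engine", "rb") as f:     # placeholder engine path
        engine = runtime.deserialize_cuda_engine(f.read())

    # Placeholder input -- real preprocessing must match what nvinfer does.
    batch = np.random.rand(1, 3, 640, 640).astype(np.float32)
    outputs = infer(engine, batch)
    np.savez("outputs_fp16.npz", **outputs)        # raw tensors for offline comparison
```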

• Hardware Platform (Jetson / GPU): Jetson
• DeepStream Version: 1.2.0
• TensorRT Version: 10.3.0
• NVIDIA GPU Driver Version (valid for GPU only): CUDA 12.6

  1. Build a pipeline with nvstreammux/nvinfer/fakesink, and add a probe function at the src pad of nvinfer.
  2. Add dump-input-tensor/overwrite-input-tensor/ip-tensor-file to the nvinfer configuration file so that the input tensor fed to each engine precision is identical.
    You can use dump-input-tensor/dump-output-tensor to dump the fp32 engine’s input as a baseline.
    Refer to /opt/nvidia/deepstream/deepstream/sources/libs/nvdsinfer/nvdsinfer_context_impl.cpp for how the tensors are dumped and loaded.
  3. Get the output results of the different engine precisions (fp32/fp16/int8) in the probe function (see the sketch after this list).
  4. Finally, compare the output results.
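
As a rough illustration of step 3, a buffer probe written with the DeepStream Python bindings (pyds) can read the raw output tensor meta attached by nvinfer and dump it to disk. This is only a sketch: it assumes output-tensor-meta=1 is set in the nvinfer configuration so the tensor meta is attached, it assumes FP32 output layers, the element name ("pgie") and file-naming scheme are examples, and the exact pyds field names may vary slightly between versions.

```python
import ctypes

import gi
import numpy as np
import pyds

gi.require_version("Gst", "1.0")
from gi.repository import Gst


def nvinfer_src_pad_probe(pad, info, u_data):
    """Dump every raw output layer of every frame to a .npy file."""
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_user = frame_meta.frame_user_meta_list
        while l_user is not None:
            user_meta = pyds.NvDsUserMeta.cast(l_user.data)
            if user_meta.base_meta.meta_type == pyds.NvDsMetaType.NVDSINFER_TENSOR_OUTPUT_META:
                tensor_meta = pyds.NvDsInferTensorMeta.cast(user_meta.user_meta_data)
                for i in range(tensor_meta.num_output_layers):
                    layer = pyds.get_nvds_LayerInfo(tensor_meta, i)
                    # Assumes FP32 output layers; adjust the ctypes type otherwise.
                    ptr = ctypes.cast(pyds.get_ptr(layer.buffer),
                                      ctypes.POINTER(ctypes.c_float))
                    n = layer.dims.numElements
                    arr = np.ctypeslib.as_array(ptr, shape=(n,)).copy()
                    np.save(f"frame{frame_meta.frame_num}_layer{i}.npy", arr)
            try:
                l_user = l_user.next
            except StopIteration:
                break
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK


# Attach to the src pad of the nvinfer element (named "pgie" here as an example):
# pgie.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER, nvinfer_src_pad_probe, 0)
```

The .npy dumps from the fp32 baseline run can then be compared offline against the fp16/int8 runs (for example, max absolute difference or cosine similarity with NumPy), which covers step 4.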
