Inconsistent results when secondary engine is run through DeepStream SDK

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.3
• TensorRT Version 8.5.3-1
• NVIDIA GPU Driver Version (valid for GPU only) 550.120
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)

Hi everyone,

I am currently running a secondary classifier after primary object detection (YOLOv8) in DeepStream. The secondary classifier was initially trained in PyTorch, then converted to ONNX and subsequently to a .trt file using trtexec.

However, I’ve noticed that the results of the secondary classifier in DeepStream differ from those of the ONNX model. To debug this, I implemented a converter and a JPEG encoder to save the images. Additionally, at the app sink, I am saving metadata (bounding boxes, detected classes, and probability scores) into a JSON file.

To verify the inference consistency, I cropped the bounding boxes from the source image, applied the same preprocessing, and performed inference using both the ONNX model and the .trt file. When comparing results:

  • Less than 0.5% of the images had different predicted labels (with low confidence scores for both models).
  • Overall, the probability scores were nearly identical, suggesting that the engine file conversion was accurate.

However, the dumped results from DeepStream differ drastically:

  • Nearly half of the labels are predicted incorrectly.
  • The probability scores in DeepStream are consistently high (close to 1), whereas they were more reasonable in direct inference.

I came across this forum thread, which mentions setting write-output-file=1 in the model config. I have enabled this parameter, but I am unsure where the output data is being saved.

Can anyone guide me on where DeepStream stores the output when write-output-file=1 is set? Additionally, any insights into why my DeepStream inference results are significantly different from direct inference would be greatly appreciated.

Thanks in advance!

model config

[property]
gpu-id=0
model-engine-file=GenderMFK0903.trt
#onnx-file=GenderMFK0903.onnx
labelfile-path=people_attribute/labels.txt
batch-size=16
network-type=1  # classifier
process-mode=2    # Secondary GIE
model-color-format=0
gie-unique-id=2
output-blob-names=gender_output
num-detected-classes=3
operate-on-class-ids=0;1;2;4
infer-dims=3;288;288
net-scale-factor= 0.01735207357279195
offsets=123.675; 116.28; 103.53
is-classifier=1
#input-object-min-width=10
#input-object-min-height=10
#output-tensor-meta=1
maintain-aspect-ratio=0
write-output-file=1

#[class-attrs-all]
#threshold=0.5

ONNX preprocessing

import cv2
import numpy as np

NET_SCALE_FACTOR = 0.01735207357279195
MEAN_VALUES = np.array([123.675, 116.28, 103.53], dtype=np.float32)  # (R, G, B)

def preprocess_image(image):
    """Preprocess the cropped image for ONNX model inference."""
    # Convert BGR to RGB
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    
    # Resize to 288x288 to match infer-dims
    image_resized = cv2.resize(image_rgb, (288, 288)).astype(np.float32)

    # Apply mean subtraction
    image_normalized = (image_resized - MEAN_VALUES) * NET_SCALE_FACTOR

    # Transpose HWC to CHW (Height, Width, Channels → Channels, Height, Width)
    image_transposed = np.transpose(image_normalized, (2, 0, 1))

    # Add batch dimension
    image_input = np.expand_dims(image_transposed, axis=0)

    return image_input

source code :
onnx_inference.txt (4.7 KB)
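For reference, running a preprocessed crop through onnxruntime looks roughly like this (a simplified sketch; the image path is illustrative and the full script is in the attached onnx_inference.txt):

import cv2
import numpy as np
import onnxruntime as ort

# Illustrative standalone check; preprocess_image is the function shown above
session = ort.InferenceSession("GenderMFK0903.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

crop = cv2.imread("cropped_object.jpg")     # cropped bounding box, BGR (illustrative path)
input_tensor = preprocess_image(crop)       # (1, 3, 288, 288) float32
probs = session.run(None, {input_name: input_tensor})[0][0]
print("label:", int(np.argmax(probs)), "scores:", probs)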

Preprocessing while running inference with the .trt file

cv::Mat cropped = image(cv::Rect(x1, y1, x2 - x1, y2 - y1));
cv::resize(cropped, cropped, cv::Size(288, 288));

cropped.convertTo(cropped, CV_32F);
cv::cvtColor(cropped, cropped, cv::COLOR_BGR2RGB);

// Preprocess: per-channel mean subtraction and scaling, written in planar CHW order
std::vector<float> input_tensor;
std::vector<float> mean_values = {123.675, 116.28, 103.53};
float scale = 0.01735207357;
for (int c = 0; c < 3; ++c) {
    for (int i = 0; i < cropped.rows; ++i) {
        for (int j = 0; j < cropped.cols; ++j) {
            float pixel = cropped.at<cv::Vec3f>(i, j)[c];
            input_tensor.push_back((pixel - mean_values[c]) * scale);
        }
    }
}

source code:
trt_inference.txt (5.1 KB)
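The triple loop above fills input_tensor in planar CHW order, which should match the np.transpose(..., (2, 0, 1)) used in the Python preprocessing; a quick illustrative NumPy check of that equivalence (random data, purely a sanity sketch):

import numpy as np

# Fake float32 RGB crop, already resized to 288x288
rng = np.random.default_rng(0)
hwc = rng.random((288, 288, 3)).astype(np.float32)
mean = np.array([123.675, 116.28, 103.53], dtype=np.float32)
scale = np.float32(0.01735207357279195)

normalized = (hwc - mean) * scale

# Python path: HWC -> CHW, then flatten
numpy_flat = np.transpose(normalized, (2, 0, 1)).ravel()

# Loop path mirroring the C++ code: for c, for i, for j
loop_flat = np.array([normalized[i, j, c]
                      for c in range(3)
                      for i in range(288)
                      for j in range(288)], dtype=np.float32)

assert np.allclose(numpy_flat, loop_flat)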

Please let me know if I need to add any more information.

You can refer to our Gst-nvinfer Gst Properties; the property is raw-output-file-write.

Could you try to upgrade your DeepStream to our latest version, 7.1, and refer to our FAQ on debugging the DeepStream Accuracy Issue?

It dumped some .bin files, e.g. `gstnvdsinfer_uid-01_layer-data_batch-0000000000_batchsize-05.bin`. Can you please tell me how to interpret this file, or share some code to read it? I guess this file contains the output of the model. Is there a way I can dump the preprocessed data going into the model?

These are binary data files dumped directly from the output of the model. They can only be viewed as raw binary data.
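For example, a dumped file can be read back with NumPy roughly like this (assuming an FP32 layer; the tensor shape is not stored in the file, so you need to supply it yourself, e.g. batch x 3 for your gender_output layer):

import numpy as np

# File name taken from your example above
data = np.fromfile("gstnvdsinfer_uid-01_layer-data_batch-0000000000_batchsize-05.bin",
                   dtype=np.float32)
print(data.size, data[:10])   # inspect the raw values, then reshape to the layer's shape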

Yes. You can get binary data for any node because this part is open source.

sources\gst-plugins\gst-nvinfer

The source code diagram is shown in the FAQ below.

I have upgraded to DeepStream 7.1.

I dumped the preprocessed data into binary files with names of the pattern
gstnvdsinfer_uid-02_layer-input_batch-0000000519_batchsize-01.bin and
gstnvdsinfer_uid-02_layer-gender_output_batch-0000000000_batchsize-01.bin

I reconstructed the preprocessed image using the following code

with open(file_path, "rb") as f:
    data = np.fromfile(f, dtype=np.float32)

# Reshape back to CHW format (assuming (3, 288, 288))
image_chw = data.reshape(3, 288, 288)

# Reverse preprocessing
image_hwc = np.transpose(image_chw, (1, 2, 0))  # CHW → HWC
image_denormalized = (image_hwc / NET_SCALE_FACTOR) + MEAN_VALUES  # Undo normalization
image_bgr = cv2.cvtColor(image_denormalized.astype(np.uint8), cv2.COLOR_RGB2BGR)  # Convert RGB → BGR

# Save the image
cv2.imwrite(output_image_path, image_bgr)

I could regenerate images, e.g.:

However, some images are distorted:

I saved each frame from the source video and confirmed that there is no color filtering or distortion happening in the source frames.

I also fed the dumped preprocessed data into the ONNX model and the TRT model (running outside DeepStream). The results differ from the model output dumped by DeepStream (the gender_output .bin files). I am attaching the code that I used to run inference on the dumped data outside DeepStream.

onnx_inference_on_raw_bin.txt (1.7 KB)
trt_inference_bin.txt (3.5 KB)

It's a three-class classification model, and I computed the Euclidean distance between the probability scores produced inside and outside DeepStream; the distribution is as follows.
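The distance computation itself is simple; a sketch with placeholder values (the real vectors come from the gender_output dumps and the offline ONNX run):

import numpy as np

# Placeholder probability vectors; in practice one row per compared crop
deepstream_probs = np.array([[0.02, 0.97, 0.01]], dtype=np.float32)  # from gender_output dumps
onnx_probs = np.array([[0.94, 0.04, 0.02]], dtype=np.float32)        # from the offline ONNX run

distances = np.linalg.norm(deepstream_probs - onnx_probs, axis=1)
print("mean:", distances.mean(), "median:", np.median(distances), "max:", distances.max())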



What I am trying to say is that the same engine file gives me different results when run inside and outside DeepStream.

Please let me know if I need to add any more information.
Thanks in advance!

This may be caused by asynchronous processing when you dump the data. The real data in the pipeline will not show similar problems.

Let’s troubleshoot this inconsistency issue directly. Can you attach each of the three pipelines you described above separately?

code : custom_deepstream.zip (47.2 MB)

source video :


model :
yolov8n_coco80_71.zip (36.4 MB)

Hi @daredeviles888, the code you provided is too complicated. Let’s simplify that.

  1. Get just one image from the video that produces different inference results
  2. Describe how to run the 3 pipelines with that image
  3. Use just this image to analyze the differing results between the three inference methods

I have simplified the DeepStream pipeline code to classify a single image. Along with that, I am attaching a few images and the Python code to classify the same image using the ONNX model.
sgie_debug.zip (29.8 MB) The instructions to run the code are in the Readme file.

Feeding the image ‘frame_7383_1.jpg’ to DeepStream, the classID at index 1 gets the highest probability (0.51904296875). When the same image is fed into the ONNX model, the classID at index 0 gets the highest probability (0.94159544).

Could you try to modify your code as below to check whether that increases the accuracy?

1. Please set the streammux width and height to the width and height of the picture, as shown below.

    streammux.set_property("width", <width of the image>)
    streammux.set_property("height", <height of the image>)

2. Please set scaling-filter=1 in the nvinfer config file.

Changing the height and width of the streammux to the resolution of the input image has improved the accuracy; it now predicts the correct label. But there is still a significant difference in probability between running in DeepStream and ONNX.
deepstream : label -> 0, probability -> 0.84716796875
onnx-python : label -> 0, probability -> 0.94159544

Also, I want to deploy this secondary classifier after a primary detector; how can I set the streammux height and width in this scenario?

Can you post the original DeepStream probability, now that it is 0.84716796875? Judging from this improvement, the biggest factor affecting the probability should be how the video is processed. Since you used PyTorch to process the video data during training, the probability from inference is higher when the video data is processed the same way as in PyTorch.

Set the streammux width and height to the width and height of your image.

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks