SGIE Classifier output inconsistent in deepstream app

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU): Nvidia Jetson Xavier AGX
• DeepStream Version: 6.3
• JetPack Version (valid for Jetson only): 5.1.2-b104
• TensorRT Version: 8.5.2-1+cuda11.4
• Issue Type (questions, new requirements, bugs): Bug, Question
• How to reproduce the issue? (This is for bugs. Include which sample app is used, the configuration file contents, the command line used, and other details for reproducing): Files and description provided in the post
• Requirement details (This is for new requirements. Include the module name, i.e. for which plugin or which sample application, and the function description)

SGIE Classifier output inconsistent in deepstream app

Background

I have a classification model, ResNet34, that I am running as an SGIE on the crops output from the PGIE detection model.
I have a modified version of deepstream-test2; the changes are that it can ingest .mp4 videos and that it prints out the classification results per frame.
deepstream-infer-tensor-meta-test was referenced for the probe that prints out the SGIE outputs.

Model does not work without output-tensor-meta: 1

When using this classifier in the deepstream-test2 app, I found that I need to set output-tensor-meta: 1 in order to get any output from the classifier.
When using the sample classifier model, I am able to see the output of the classifier in classifier_meta_list; however, classifier_meta_list is not populated when using my model.
When I set output-tensor-meta: 1, I am able to extract classes from the tensor meta.
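
For reference, the "default manner" of reading the classifier output (i.e. from classifier_meta_list, as the unmodified deepstream-test2 path does) looks roughly like the following probe. This is a minimal sketch with the placeholder name classifier_meta_probe, not the code from the attached zip; the metadata fields follow nvdsmeta.h. It works with the sample classifier but stays empty for my model unless output-tensor-meta is enabled.

    #include <gst/gst.h>
    #include "gstnvdsmeta.h"

    /* Minimal sketch of the "default" path: read the NvDsClassifierMeta that
     * nvinfer attaches to each object from obj_meta->classifier_meta_list. */
    static GstPadProbeReturn
    classifier_meta_probe (GstPad *pad, GstPadProbeInfo *info, gpointer u_data)
    {
      GstBuffer *buf = (GstBuffer *) info->data;
      NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);

      for (NvDsMetaList *l_frame = batch_meta->frame_meta_list; l_frame;
           l_frame = l_frame->next) {
        NvDsFrameMeta *frame_meta = (NvDsFrameMeta *) l_frame->data;
        for (NvDsMetaList *l_obj = frame_meta->obj_meta_list; l_obj;
             l_obj = l_obj->next) {
          NvDsObjectMeta *obj_meta = (NvDsObjectMeta *) l_obj->data;
          for (NvDsMetaList *l_cls = obj_meta->classifier_meta_list; l_cls;
               l_cls = l_cls->next) {
            NvDsClassifierMeta *cls_meta = (NvDsClassifierMeta *) l_cls->data;
            for (NvDsMetaList *l_label = cls_meta->label_info_list; l_label;
                 l_label = l_label->next) {
              NvDsLabelInfo *label = (NvDsLabelInfo *) l_label->data;
              g_print ("object %lu: class %u prob %f\n",
                  (unsigned long) obj_meta->object_id,
                  label->result_class_id, label->result_prob);
            }
          }
        }
      }
      return GST_PAD_PROBE_OK;
    }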

I have also noticed that when attempting to use my classifier as the PGIE, it does not populate obj_meta_list or frame_user_meta_list.

Question 1: Why does my model not output results that can be accessed in the default manner?

I have uploaded my model. It is a ResNet34, which should be very similar to the sample ResNet18. Why does my model not work as expected?
What options do I have to diagnose this?

The model does not output expected values.

The output of the model running in this pipeline is different than expected. A classification is successfully extracted for each object, and the raw class probabilities read from outputCoverageBuffer (float probability = outputCoverageBuffer[c]) look reasonable, but when comparing the output with the ground truth or the pre-TensorRT model, the results are very different.

The outputs of the PyTorch model, the ONNX model, and the ground truth all match fairly closely (within ~30%).
However, the same model converted to a TensorRT engine and run in this DeepStream app produces drastically different results.

I have also used trtexec to run the TensorRT engine directly on the frames from the video, and this somehow produces different values than running the model in the deepstream-test2 app.
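
To make that comparison concrete, a small hypothetical helper (not part of the uploaded code) can write the raw per-object scores from the probe to CSV so they can be diffed offline against the ONNX/trtexec outputs for the same frames:

    #include <stdio.h>

    /* Sketch: dump one object's raw class scores as a CSV row. The caller
     * passes the frame number, object id, and the score buffer and class
     * count taken from the SGIE probe. */
    static void
    dump_scores_csv (FILE *fp, int frame_num, unsigned long object_id,
                     const float *scores, unsigned int num_classes)
    {
      fprintf (fp, "%d,%lu", frame_num, object_id);
      for (unsigned int c = 0; c < num_classes; c++)
        fprintf (fp, ",%f", scores[c]);
      fprintf (fp, "\n");
    }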

Question 2: Why does my model output different values than expected?

Why does the model produce different values than expected once converted to a TensorRT engine?
I also have a regression model with the same architecture (ResNet34) except for the last layer, and its results are consistent across PyTorch <-> ONNX <-> DeepStream. The conversion process for this model is the same as for the classifier, yet the classifier is not functioning correctly.

Resources

I have attached modified-test2.zip (36.4 KB), which contains:

  • The modified deepstream-test2 app
  • The configs I am using for the deepstream-test2 app

The ONNX model is available here: Dropbox

And I run the app with GST_DEBUG=3 ./deepstream-test2-app modified/dstest2_config.yml

It may take some time to investigate your customized case. Will be back when there is any progress.

I can't find your label file for the SGIE model. There is a classifier sample in NVIDIA-AI-IOT/deepstream_tao_apps at release/tao4.0_ds6.3ga (github.com).

Your customized SGIE postprocessing is wrong. I don't understand why you used the unique_id as the class id.

          NvDsLabelInfo *label_info =
              nvds_acquire_label_info_meta_from_pool(batch_meta);
          label_info->result_class_id = attr.attributeValue;
          g_print(" %d\n", attr.attributeValue);

          label_info->result_prob = attr.attributeConfidence;

          /* Fill label name */
          switch (meta->unique_id)
          {
          case 2: // sgie1_unique_id:
            strcpy(label_info->result_label,
                   sgie1_classes_str[label_info->result_class_id]);
            break;
          case 3: // sgie2_unique_id:
            strcpy(label_info->result_label,
                   sgie2_classes_str[label_info->result_class_id]);
            break;
          case 4: // sgie3_unique_id:
            strcpy(label_info->result_label,
                   sgie3_classes_str[label_info->result_class_id]);
            break;
          default:
            break;
          }

Please read the source code and our samples.

Hi, thank you for your quick response.

I don’t find your label file for the SGIE model
The label file is not needed to replicate these issues. We are currently printing out the class index and using that for evaluation.

Your customized SGIE postprocessing is wrong. I don’t understand why did you use the unique_id as the class id.

The class id is being used. The switch statement only chooses which SGIE's label array the result_class_id is mapped into, so meta->unique_id is used solely to select the label set, not as the class id.
The section that makes use of the output class is sgie1_classes_str[label_info->result_class_id]. This sgie_pad_buffer_probe() code is taken directly from the deepstream-infer-tensor-meta-test source code, which is provided with DeepStream.

The only code added in that section is the g_print(" %d\n", attr.attributeValue); call, which I'm using to see the classification results per frame. attr.attributeValue is assigned slightly higher up in the function, where the classifier's output layer is iterated over and the index with the highest probability is selected (the surrounding tensor-meta setup is sketched after the snippet below):

        for (unsigned int c = 0; c < numClasses; c++)
        {
          float probability = outputCoverageBuffer[c];
          // g_print("\tclass %d probability: %f\n", c, outputCoverageBuffer[c]);
          if (probability > maxProbability)
          {
            maxProbability = probability;
            attrFound = true;
            attr.attributeIndex = 0;
            attr.attributeValue = c;
            attr.attributeConfidence = probability;
          }
        }
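
For context, the part of sgie_pad_buffer_probe() that sets up numClasses and outputCoverageBuffer follows the tensor-meta access from deepstream-infer-tensor-meta-test. Below is a minimal sketch of that setup, wrapped here in a hypothetical helper get_sgie_argmax() rather than copied from the attached zip; field names follow gstnvdsmeta.h / gstnvdsinfer.h.

    #include "gstnvdsmeta.h"
    #include "gstnvdsinfer.h"

    /* Sketch: given one object's metadata, locate the SGIE output tensor that
     * nvinfer attaches (requires output-tensor-meta: 1) and return the argmax
     * class index, as deepstream-infer-tensor-meta-test does. */
    static int
    get_sgie_argmax (NvDsObjectMeta *obj_meta, float *max_prob_out)
    {
      for (NvDsMetaList *l_user = obj_meta->obj_user_meta_list; l_user;
           l_user = l_user->next) {
        NvDsUserMeta *user_meta = (NvDsUserMeta *) l_user->data;
        if (user_meta->base_meta.meta_type != NVDSINFER_TENSOR_OUTPUT_META)
          continue;

        NvDsInferTensorMeta *meta =
            (NvDsInferTensorMeta *) user_meta->user_meta_data;

        /* Point each output layer at its host-side buffer copy. */
        for (unsigned int i = 0; i < meta->num_output_layers; i++)
          meta->output_layers_info[i].buffer = meta->out_buf_ptrs_host[i];

        /* The classifier has a single output layer: one score per class. */
        unsigned int numClasses =
            meta->output_layers_info[0].inferDims.numElements;
        float *outputCoverageBuffer =
            (float *) meta->output_layers_info[0].buffer;

        /* Same argmax as the loop quoted above. */
        int best = -1;
        float maxProbability = 0.0f;
        for (unsigned int c = 0; c < numClasses; c++) {
          if (best < 0 || outputCoverageBuffer[c] > maxProbability) {
            maxProbability = outputCoverageBuffer[c];
            best = (int) c;
          }
        }
        if (max_prob_out)
          *max_prob_out = maxProbability;
        return best;
      }
      return -1;   /* no tensor meta attached to this object */
    }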

This code works to extract classes from both the sample model and our model, but the issue is that the output from our model does not match the expected performance.
So while the classes that are output look plausible, the rate at which the model misclassifies the input is drastically different than when running the ONNX model, for example.

I have also run check_model.py: onnx.checker.check_model(model), which does not show any issues.

You only configured one SGIE in your code, so there is no need to switch between the different SGIEs.

When you enable "output-tensor-meta", please refer to /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-infer-tensor-meta-test for where to probe the postprocessing functions for the PGIE and SGIEs.
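
A minimal sketch of what that probing looks like when wired up in the app's main(): pgie, sgie1, pgie_pad_buffer_probe and sgie_pad_buffer_probe are placeholder names, assuming src-pad buffer probes on the nvinfer elements as in the sample.

    /* Sketch: attach the post-processing probes to the nvinfer src pads.
     * The element and callback names are placeholders for those used in the
     * modified test2 app. */
    static void
    attach_infer_probes (GstElement *pgie, GstElement *sgie1)
    {
      GstPad *pad;

      pad = gst_element_get_static_pad (pgie, "src");
      gst_pad_add_probe (pad, GST_PAD_PROBE_TYPE_BUFFER,
          pgie_pad_buffer_probe, NULL, NULL);
      gst_object_unref (pad);

      pad = gst_element_get_static_pad (sgie1, "src");
      gst_pad_add_probe (pad, GST_PAD_PROBE_TYPE_BUFFER,
          sgie_pad_buffer_probe, NULL, NULL);
      gst_object_unref (pad);
    }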

Hi, I would appreciate it if you could give my post a more thorough read.
I specifically mention that the code you are referencing is from /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-infer-tensor-meta-test in both of my previous replies.

I realize that only one SGIE is configured, but this does not relate to the two questions this post is about.
The questions relate to the one SGIE that is configured.

Please refer to /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-infer-tensor-meta-test for the multiple-SGIE case. The only difference is the gie-unique-id.

Hi, thank you.
That has been referred to. I am able to run the code I uploaded here successfully using the sample models and inputs.
As mentioned, this code is a direct combination of /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-infer-tensor-meta-test
and /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test2

The issue is not that there are multiple SGIEs; the issue is that the SGIE being used with this code does not output the anticipated values for the provided input.

It is understood that when this pipeline is configured to use one SGIE, there will only be output from that one SGIE. The question is why the output from that one SGIE is incorrect, whereas the sample models produce the correct output.

Can you provide a complete project with the PGIE YOLO ONNX model, its label file, and the test video so that we can run the same case as you? Your classification_model.onnx output layer dimension is 1x5 float32. Assuming these are class confidences, there are only 5 classes, but your modified deepstream-test2 code uses 12 labels (sgie1_classes_str). Can you tell us what classification_model.onnx classifies?

Hi,
The PGIE, labels, and input are proprietary. I am currently working with my company and the NVIDIA Inception program on how best to get you what you need. For the uploaded sample code, I am only making use of the numerical class IDs, not the label mapping.

While I am looking into getting you the rest of the data you need, do you have any insight into the other issue this post is about?

Model does not work without output-tensor-meta: 1

I described the issue in detail in the initial post, but for some reason the uploaded classifier model does not work without setting output-tensor-meta: 1. This has been tested with /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test2, which does not access the classifier output through the tensor meta.
It has been confirmed that the provided sample classifier model, which has a similar architecture, does work with this method.

Do you have any suggestions as to why this could be occurring?

I need the complete code/configurations/models to check the reason for the issues you mentioned.

I have asked for the label file in the post SGIE Classifier output inconsistent in deepstream app - #4 by Fiona.Chen, and I also need your PGIE model and configurations to reproduce your issue.

Can you provide the PGIE model and configurations so that we can debug with your real case?

We don't know anything about your models, so your configurations are important. For example, if your PGIE model detects cars, persons, and trees, and your SGIE classifies tree types, and the tree class id from the PGIE is 2, then the tree classifier SGIE configuration "operate-on-class-ids" should be set to 2.

From the source code, configurations, and SGIE ONNX you posted here, we can't tell whether they are right or not, since we don't know anything about the models.