Problem running EfficientNet B0 as SGIE in DeepStream

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.3
• JetPack Version (valid for Jetson only)
• TensorRT Version 8.5.3
• NVIDIA GPU Driver Version (valid for GPU only) 535.183.01
• Issue Type( questions, new requirements, bugs) questions
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing) using deepstream_app example code
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description) sgie

Hi, I am having an issue running EfficientNet B0 as the SGIE in DeepStream. The bounding boxes and labels are drawn on the output video, but I am unable to get the classifier labels. I have double-checked my labels.txt for the PGIE, and it seems to be correct. The SGIE is filtering on the class ID.

I have tested the exported ONNX file using Python, and it seems to work.

I trained the model using TAO 5.5.

My TAO configuration:

results_dir: "/workspace/tao-experiments/results/efficientnetb0/"
dataset:
  train_dataset_path: "/workspace/tao-experiments/data/climbing/train/"
  val_dataset_path: "/workspace/tao-experiments/data/climbing/val/"
  preprocess_mode: 'torch'
  num_classes: 2
  augmentation:
    enable_random_crop: True
    enable_center_crop: True
    enable_color_augmentation: True
    disable_horizontal_flip: False
    mixup_alpha: 0.4
train:
  qat: False
  checkpoint: '/workspace/tao-experiments/pretrained/efficientnet-b0-tf2/'
  batch_size_per_gpu: 32
  num_epochs: 80
  optim_config:
    optimizer: 'sgd'
    momentum: 0.9
  lr_config:
    scheduler: 'cosine'
    learning_rate: 0.05
    soft_start: 0.05
  reg_config:
    type: 'L2'
    scope: ['conv2d', 'dense']
    weight_decay: 0.00005
model:
  backbone: 'efficientnet-b0'
  input_width: 256
  input_height: 256
  input_channels: 3
  input_image_depth: 8

I exported to ONNX using the following command:

tao model classification_tf2 export -e /workspace/tao-experiments/specs/climbing/efficientnetb0/specs/spec.yaml export.checkpoint=/workspace/tao-experiments/results/efficientnetb0/train/efficientnet-b0_062.tlt export.onnx_file=/workspace/tao-experiments/results/classifier_nonaugment.onnx
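Before wiring the exported ONNX file into nvinfer, it can help to confirm the input dims and the output layer name (the output-blob-names value used in the SGIE config below). A minimal check with onnxruntime, assuming a local copy of the exported file:

import onnxruntime as ort

session = ort.InferenceSession("classifier_nonaugment.onnx")
for i in session.get_inputs():
    print("input:", i.name, i.shape)    # expect something like [N, 3, 256, 256]
for o in session.get_outputs():
    print("output:", o.name, o.shape)   # expect 'Identity:0' with shape [N, 2]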

The main config for the deepstream_app is:

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=1

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI 4=RTSP
type=3
uri=file:///app/data/climbing.mp4
num-sources=1
gpu-id=0
cudadec-memtype=0


[streammux]
live-source=0
batch-size=1
batched-push-timeout=40000
width=1280
height=720

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=/app/config/model_yolov8n_onnx.txt

[secondary-gie0]
enable=1
gpu-id=0
gie-unique-id=2
operate-on-gie-id=1
nvbuf-memory-type=0
config-file=/app/config/config_sgie_person_efficientnetb0.txt

[tiled-display]
enable=1
rows=1
columns=1
width=1280
height=720

[osd]
enable=1
border-width=2
text-size=15
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
display-text=1

[sink0]
enable=1
type=3
#1=mp4 2=mkv
container=1
#1=h264 2=h265 3=mpeg4
codec=1
#encoder type 0=Hardware 1=Software
enc-type=0
sync=0
bitrate=2000000
#H264 Profile - 0=Baseline 2=Main 4=High
#H265 Profile - 0=Main 1=Main10
profile=0
output-file=/app/output/out_yolo_nvdcf.mp4
source-id=0

[tests]
file-loop=0

[tracker]
enable=1
tracker-width=640
tracker-height=384
gpu-id=0
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvmultiobjecttracker.so
ll-config-file=/app/models/tracker_config.yml

My PGIE config is:

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
model-color-format=0
#model-engine-file=/app/models/yolov8n_warehouse/ppe.engine
#onnx-file=/app/models/yolov8n_warehouse/yolov8n.onnx
#labelfile-path=/app/models/yolov8n_warehouse/labels.txt

model-engine-file=/app/models/yolov8n_warehouse_v1-0-0/yolov8n.engine
onnx-file=/app/models/yolov8n_warehouse_v1-0-0/yolov8n.onnx
labelfile-path=/app/models/yolov8n_warehouse_v1-0-0/labels.txt

batch-size=1
network-mode=2
num-detected-classes=7
force-implicit-batch-dim=0
interval=2
gie-unique-id=1
process-mode=1
network-type=0
cluster-mode=2
maintain-aspect-ratio=1
symmetric-padding=1
parse-bbox-func-name=NvDsInferParseYolo
custom-lib-path=/opt/nvidia/deepstream/deepstream/sources/DeepStream-Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet

[class-attrs-all]
nms-iou-threshold=0.45
pre-cluster-threshold=0.25
topk=300
detected-min-w=20
detected-min-h=20

My SGIE config is:

[property]
net-scale-factor=0.017507
offsets=123.675;116.280;103.53
model-color-format=0
onnx-file=/app/models/classifier/climbing/classifier_nonaugment.onnx
model-engine-file=/app/models/classifier/climbing/classifier_nonaugment.engine
labelfile-path=/app/models/classifier/climbing/labels.txt
infer-dims=3;256;256
batch-size=1
network-mode=0
network-type=1
process-mode=2
num-detected-classes=2
output-blob-names=Identity:0
classifier-threshold=0.1
gie-unique-id=2
operate-on-class-ids=0
output-tensor-meta=1
interval=0
classifier-async-mode=1

Python code to test the ONNX model:

import onnxruntime as ort
import numpy as np
import cv2
import os
import shutil

# Path to the ONNX model
MODEL_PATH = "climbing_models/classifier_nonaugment.onnx"

# Class labels
LABELS = ["not_climbing", "climbing"]

# Input image dimensions
INPUT_WIDTH = 256
INPUT_HEIGHT = 256

# Normalization parameters (PyTorch preprocessing)
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

# Output directories for classification results
OUTPUT_DIR = "results"
CLIMBING_DIR = os.path.join(OUTPUT_DIR, "climbing")
NOT_CLIMBING_DIR = os.path.join(OUTPUT_DIR, "not_climbing")

# Ensure output directories exist
os.makedirs(CLIMBING_DIR, exist_ok=True)
os.makedirs(NOT_CLIMBING_DIR, exist_ok=True)

def preprocess_image(image_path):
    """
    Preprocess the input image to match the model's input requirements.
    """
    # Load the image using OpenCV
    image = cv2.imread(image_path)
    if image is None:
        raise ValueError(f"Image not found: {image_path}")
    
    # Resize to the model's expected input size
    image = cv2.resize(image, (INPUT_WIDTH, INPUT_HEIGHT))
    
    # Convert to RGB
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    
    # Normalize the image (scale to [0, 1] and apply mean/std normalization)
    image = image.astype(np.float32) / 255.0
    image -= MEAN
    image /= STD
    
    # Transpose to CHW format (channel-first) for the model
    image = np.transpose(image, (2, 0, 1))
    
    # Add batch dimension
    image = np.expand_dims(image, axis=0)
    
    return image

# Create the ONNX Runtime session once instead of per image
session = ort.InferenceSession(MODEL_PATH)

def run_inference(image_path):
    """
    Run inference on the input image using the ONNX model.
    """
    
    # Get model input and output names
    input_name = session.get_inputs()[0].name
    output_name = session.get_outputs()[0].name
    
    # Preprocess the input image
    input_image = preprocess_image(image_path)
    
    # Run inference
    outputs = session.run([output_name], {input_name: input_image})
    
    # Get the predicted class (assuming single output)
    predicted_class = np.argmax(outputs[0], axis=1)[0]
    confidence = outputs[0][0][predicted_class]
    
    return LABELS[predicted_class], confidence

def classify_and_copy_images(image_dir):
    """
    Classify images in the input directory and copy them to the appropriate output folder.
    """
    for image_file in os.listdir(image_dir):
        image_path = os.path.join(image_dir, image_file)
        if not image_file.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        
        try:
            label, confidence = run_inference(image_path)
            print(f"Image: {image_file} | Predicted: {label} | Confidence: {confidence:.2f}")
            
            # Determine the output folder based on the label
            if label == "climbing":
                shutil.copy(image_path, os.path.join(CLIMBING_DIR, image_file))
            elif label == "not_climbing":
                shutil.copy(image_path, os.path.join(NOT_CLIMBING_DIR, image_file))
        except Exception as e:
            print(f"Error processing {image_file}: {e}")

if __name__ == "__main__":
    # Path to the directory containing images
    IMAGE_DIR = "test_data"
    
    # Classify and copy images
    classify_and_copy_images(IMAGE_DIR)

gst-nvinfer does not support per-channel scale factors. You may need to implement that yourself.
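For context: nvinfer's documented preprocessing is y = net-scale-factor * (x - offset), with per-channel offsets but a single scalar scale, while torch-style normalization divides each channel by its own std. A quick sketch of the resulting per-channel error, assuming the documented 0.017507 scale:

import numpy as np

STD = np.array([0.229, 0.224, 0.225])

# What nvinfer applies: one scalar scale for all channels.
single_scale = 1.0 / (255 * 0.224)      # the documented 0.017507

# What torch-mode preprocessing actually needs: one scale per channel.
per_channel = 1.0 / (255 * STD)

# Relative error per channel from using a single scale: ~[+2.2%, 0%, +0.4%]
print((single_scale - per_channel) / per_channel)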

What is class 0 of your PGIE (YOLOv8n model)? What will your SGIE (EfficientNet-B0 model) classify?

We don't know anything about the models you used in your app, so no suggestion can be given.

Take our /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-test2 as an example: there are 3 models used in this pipeline. The PGIE is the TrafficCamNet model, which outputs 4 classes: car - class 0, bicycle - class 1, person - class 2, road_sign - class 3. The two SGIEs are a vehicle type classifier and a vehicle make classifier, which classify the car's type and make. So in the SGIE configurations there is an "operate-on-class-ids=0" setting, because class 0 is car for the PGIE model.
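The relevant keys from those test2 SGIE configs look like this (paraphrased):

# In each SGIE nvinfer config of deepstream-test2 (paraphrased):
operate-on-gie-id=1      # consume objects produced by the PGIE
operate-on-class-ids=0   # class 0 = car for TrafficCamNet
process-mode=2           # operate on objects, not full frames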

And you also need to make sure the /app/models/classifier/climbing/labels.txt file is written in the right format. The class labels should be put on one line, separated by semicolons; please refer to /opt/nvidia/deepstream/deepstream/samples/models/Secondary_VehicleMake/labels.txt

The YOLO model outputs 7 classes, and we tend to use only the PGIE for our application, but now we are adding an SGIE for certain conditions. Class 0 is person, and the SGIE classifies whether the person is climbing or not climbing. I extracted the person crops from the video file using the same YOLO model used in DeepStream (file:///app/data/climbing.mp4), ran the ONNX classifier with the Python script, and got the following results.

Image: frame_275.jpg | Predicted: not_climbing | Confidence: 0.96
Image: frame_274.jpg | Predicted: not_climbing | Confidence: 0.96
Image: frame_281.jpg | Predicted: not_climbing | Confidence: 0.82
Image: frame_277.jpg | Predicted: not_climbing | Confidence: 0.93
Image: frame_264.jpg | Predicted: not_climbing | Confidence: 0.96
Image: frame_345.jpg | Predicted: climbing | Confidence: 0.96
Image: frame_348.jpg | Predicted: climbing | Confidence: 0.81
Image: frame_442.jpg | Predicted: climbing | Confidence: 0.95
Image: frame_278.jpg | Predicted: climbing | Confidence: 0.90
Image: frame_272.jpg | Predicted: not_climbing | Confidence: 0.94
Image: frame_266.jpg | Predicted: not_climbing | Confidence: 0.95
Image: frame_343.jpg | Predicted: climbing | Confidence: 0.94
Image: frame_250.jpg | Predicted: climbing | Confidence: 0.96
Image: frame_267.jpg | Predicted: climbing | Confidence: 0.93
Image: frame_271.jpg | Predicted: not_climbing | Confidence: 0.94
Image: frame_259.jpg | Predicted: not_climbing | Confidence: 0.93
Image: frame_256.jpg | Predicted: not_climbing | Confidence: 0.80
Image: frame_257.jpg | Predicted: not_climbing | Confidence: 0.89
Image: frame_270.jpg | Predicted: climbing | Confidence: 0.93
Image: frame_282.jpg | Predicted: not_climbing | Confidence: 0.85
Image: frame_518.jpg | Predicted: climbing | Confidence: 0.76
Image: frame_258.jpg | Predicted: not_climbing | Confidence: 0.95
Image: frame_261.jpg | Predicted: not_climbing | Confidence: 0.96
Image: frame_269.jpg | Predicted: not_climbing | Confidence: 0.95
Image: frame_276.jpg | Predicted: not_climbing | Confidence: 0.96
Image: frame_516.jpg | Predicted: climbing | Confidence: 0.84
Image: frame_521.jpg | Predicted: climbing | Confidence: 0.85
Image: frame_520.jpg | Predicted: climbing | Confidence: 0.90
Image: frame_344.jpg | Predicted: climbing | Confidence: 0.94
Image: frame_260.jpg | Predicted: not_climbing | Confidence: 0.95
Image: frame_512.jpg | Predicted: climbing | Confidence: 0.96
Image: frame_273.jpg | Predicted: not_climbing | Confidence: 0.95
Image: frame_517.jpg | Predicted: climbing | Confidence: 0.87
Image: frame_440.jpg | Predicted: climbing | Confidence: 0.95
Image: frame_263.jpg | Predicted: not_climbing | Confidence: 0.96
Image: frame_279.jpg | Predicted: not_climbing | Confidence: 0.92
Image: frame_262.jpg | Predicted: not_climbing | Confidence: 0.94
Image: frame_341.jpg | Predicted: climbing | Confidence: 0.98
Image: frame_265.jpg | Predicted: not_climbing | Confidence: 0.94
Image: frame_248.jpg | Predicted: climbing | Confidence: 0.96
Image: frame_519.jpg | Predicted: climbing | Confidence: 0.84
Image: frame_255.jpg | Predicted: not_climbing | Confidence: 0.76
Image: frame_268.jpg | Predicted: not_climbing | Confidence: 0.95
Image: frame_342.jpg | Predicted: climbing | Confidence: 0.98
Image: frame_280.jpg | Predicted: not_climbing | Confidence: 0.92

From Deploying to DeepStream for Classification TF1/TF2/PyTorch - NVIDIA Docs: for a TAO spec file with preprocess_mode: 'torch', the net-scale-factor, offsets, and model-color-format should be the following values based on the documentation.

net-scale-factor=0.017507
offsets=123.675;116.280;103.53
model-color-format=0
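These values follow directly from the torch normalization constants; a quick arithmetic check (assuming, as the numbers suggest, that the single scale is derived from the 0.224 std value):

MEAN = [0.485, 0.456, 0.406]

# offsets are the per-channel means rescaled to [0, 255]
print([255 * m for m in MEAN])   # [123.675, 116.28, 103.53]

# one scalar net-scale-factor, taken here from the 0.224 std value
print(1.0 / (255 * 0.224))       # ~0.017507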

I modified my labels.txt to use semicolons:

not_climbing;climbing

But using the same video file, every person in the video is classified as climbing. Am I doing something wrong? The classifier output is always climbing, whereas the Python ONNX example varies between climbing and not_climbing. Should I increase the threshold? 0.5 doesn't work; every person is still classified as climbing. Maybe I got the labels the wrong way round (climbing;not_climbing), but I would expect both classifications to show in the output video, similar to the ONNX result.
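One quick way to pin down the label order is to print the full probability vector for a crop whose class is known, instead of relying on argmax plus an assumed order. A sketch reusing preprocess_image() from the test script above; frame_275.jpg is one of the crops that scored 0.96 not_climbing earlier:

import onnxruntime as ort

# Reuses preprocess_image() from the earlier test script.
session = ort.InferenceSession("climbing_models/classifier_nonaugment.onnx")
input_name = session.get_inputs()[0].name

# A crop known to be not_climbing: whichever index holds the larger
# probability is the not_climbing index, which fixes the labels.txt order.
probs = session.run(None, {input_name: preprocess_image("test_data/frame_275.jpg")})[0][0]
print(probs)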

Can you provide the ONNX model you are using?

classifier_nonaugment.zip (14.2 MB)
Attached is the ONNX file.

Please do not enable classifier-async-mode for this kind of classifier.

Still the same result: every person is classified as climbing.

[property]
net-scale-factor=0.017507
offsets=123.675;116.280;103.53
model-color-format=0
onnx-file=/app/models/classifier/climbing/classifier_nonaugment.onnx
model-engine-file=/app/models/classifier/climbing/classifier_nonaugment.engine
labelfile-path=/app/models/classifier/climbing/labels.txt
infer-dims=3;256;256
batch-size=1
network-mode=0
network-type=1
process-mode=2
num-detected-classes=2
output-blob-names=Identity:0
classifier-threshold=0.5
gie-unique-id=2
operate-on-class-ids=0
output-tensor-meta=1
interval=0
classifier-async-mode=0
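Since output-tensor-meta=1 is set, one way to debug is to dump what the SGIE actually attaches to each object. A minimal pad-probe sketch, assuming the DeepStream Python bindings (pyds) and a probe installed on a pad downstream of the SGIE (e.g. the nvdsosd sink pad); deepstream-app itself is C, so this would need a small custom pipeline or the deepstream_python_apps equivalent:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def sgie_debug_probe(pad, info, user_data):
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(info.get_buffer()))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            # Walk the classification results the SGIE attached to this object
            l_cls = obj_meta.classifier_meta_list
            while l_cls is not None:
                cls_meta = pyds.NvDsClassifierMeta.cast(l_cls.data)
                l_lbl = cls_meta.label_info_list
                while l_lbl is not None:
                    lbl = pyds.NvDsLabelInfo.cast(l_lbl.data)
                    print(f"obj {obj_meta.object_id}: class {lbl.result_class_id} "
                          f"'{lbl.result_label}' prob {lbl.result_prob:.2f}")
                    l_lbl = l_lbl.next
                l_cls = l_cls.next
            l_obj = l_obj.next
        l_frame = l_frame.next
    return Gst.PadProbeReturn.OK

If no NvDsClassifierMeta shows up at all, the SGIE never produced a result above classifier-threshold; if it shows up with the wrong label, the labels.txt order is the suspect.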

Do you think I should train with

dataset:
  augmentation:
    enable_center_crop: False
    enable_random_crop: False

Same as in Fine-tuned TAO ClassificationTF2 Accuracy Drop after Compiling to TensorRT.

Didn't work; it still classified every person in the video as climbing. I'm going to train with a single class and rely on the classifier threshold, plus train a ResNet model, since deepstream-test2 works with ResNet.

Do you have any other ideas about what could be wrong with the above, given that the ONNX model seems to work well outside of DeepStream?

For training-related questions, please raise a topic in the TAO Toolkit forum: Latest Intelligent Video Analytics/TAO Toolkit topics - NVIDIA Developer Forums

Yes, but the original ONNX model works fine when using onnxruntime, and I am wondering why it is not working in DeepStream. Should I be using DeepStream or create my own framework? Is there a better way to resolve this issue, as I have to deliver this to a client? If I can't get it to work in DeepStream, I would eventually have to port all my code from DeepStream to a custom C++ video pipeline.

Do you have an example of EfficientNet B0 working in DeepStream?

I tried the model and the nvinfer configuration you provided with deepstream-app, and both "not_climbing" and "climbing" can be classified.

Wow, thanks Fiona. Is that using a similar SGIE configuration, or can you provide it to me?

I just copied the configurations posted here.
