Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) GPU
• DeepStream Version 6.3
• TensorRT Version 8.5.3-1
• NVIDIA GPU Driver Version (valid for GPU only) 550.120
• Issue Type( questions, new requirements, bugs)
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
Hi everyone,
I am currently running a secondary classifier after primary object detection (YOLOv8) in DeepStream. The secondary classifier was originally trained in PyTorch, then exported to ONNX and subsequently converted to a .trt engine file using trtexec.
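The conversion was done roughly along these lines (shown schematically; the file names match the config below, any other flags are omitted here):
trtexec --onnx=GenderMFK0903.onnx --saveEngine=GenderMFK0903.trt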
However, I’ve noticed that the results of the secondary classifier in DeepStream differ from those of the ONNX model. To debug this, I implemented a converter and a JPEG encoder to save the images. Additionally, at the app sink, I am saving the metadata, including bounding boxes, detected classes, and probability scores, into a JSON file.
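The metadata dump at the app sink is essentially the following (a simplified pyds sketch to show which fields I save; the output file name and exact record structure here are illustrative):

import json
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def appsink_probe(pad, info, u_data):
    """Dump per-object bbox, detector class and classifier result to a JSON-lines file."""
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(info.get_buffer()))
    records = []
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj = pyds.NvDsObjectMeta.cast(l_obj.data)
            record = {
                "frame": frame_meta.frame_num,
                "bbox": [obj.rect_params.left, obj.rect_params.top,
                         obj.rect_params.width, obj.rect_params.height],
                "detector_class": obj.class_id,
                "detector_conf": obj.confidence,
                "classifier": [],
            }
            # Secondary classifier results attached by the SGIE
            l_cls = obj.classifier_meta_list
            while l_cls is not None:
                cls_meta = pyds.NvDsClassifierMeta.cast(l_cls.data)
                l_label = cls_meta.label_info_list
                while l_label is not None:
                    label = pyds.NvDsLabelInfo.cast(l_label.data)
                    record["classifier"].append(
                        {"label": label.result_label, "prob": label.result_prob})
                    l_label = l_label.next
                l_cls = l_cls.next
            records.append(record)
            l_obj = l_obj.next
        l_frame = l_frame.next
    with open("metadata.json", "a") as f:  # illustrative output path
        for r in records:
            f.write(json.dumps(r) + "\n")
    return Gst.PadProbeReturn.OK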
To verify inference consistency, I cropped the bounding boxes from the source image, applied the same preprocessing, and performed inference using both the ONNX model and the .trt file. When comparing results:
- Less than 0.5% of the images had different detection labels (with low confidence scores for both models).
- Overall, the probability scores were nearly identical, suggesting that the engine file conversion was accurate.
However, the dumped results from DeepStream differ drastically:
- Nearly half of the labels are predicted incorrectly.
- The probability scores in DeepStream are consistently high (close to 1), whereas they were more reasonable in direct inference.
I came across this forum thread, which mentions setting write-output-file=1 in the model config. I have enabled this parameter, but I am unsure where the output data is being saved.
Can anyone guide me on where DeepStream stores the output when write-output-file=1 is set? Additionally, any insights into why my DeepStream inference results are significantly different from direct inference would be greatly appreciated.
Thanks in advance!
model config
[property]
gpu-id=0
model-engine-file=GenderMFK0903.trt
#onnx-file=GenderMFK0903.onnx
labelfile-path=people_attribute/labels.txt
batch-size=16
network-type=1 # classifier
process-mode=2 # Secondary GIE
model-color-format=0
gie-unique-id=2
output-blob-names=gender_output
num-detected-classes=3
operate-on-class-ids=0;1;2;4
infer-dims=3;288;288
net-scale-factor= 0.01735207357279195
offsets=123.675; 116.28; 103.53
is-classifier=1
#input-object-min-width=10
#input-object-min-height=10
#output-tensor-meta=1
maintain-aspect-ratio=0
write-output-file=1
#[class-attrs-all]
#threshold=0.5
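For reference, my understanding of the nvinfer preprocessing is that each pixel is transformed per channel as y = net-scale-factor * (x - offset), with model-color-format=0 meaning RGB input. A minimal numpy sketch of what I expect nvinfer to do with the values above (the resizing itself happens inside nvinfer, so interpolation may differ):

import numpy as np

NET_SCALE_FACTOR = 0.01735207357279195
OFFSETS = np.array([123.675, 116.28, 103.53], dtype=np.float32)  # R, G, B

def nvinfer_like_normalize(rgb_hwc):
    """y = net-scale-factor * (x - offsets), per channel, then HWC -> CHW."""
    normalized = (rgb_hwc.astype(np.float32) - OFFSETS) * NET_SCALE_FACTOR
    return np.transpose(normalized, (2, 0, 1))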
ONNX preprocessing
import cv2
import numpy as np

NET_SCALE_FACTOR = 0.01735207357279195
MEAN_VALUES = np.array([123.675, 116.28, 103.53], dtype=np.float32)  # (R, G, B)

def preprocess_image(image):
    """Preprocess the cropped image for ONNX model inference."""
    # Convert BGR to RGB
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    # Resize to 288x288
    image_resized = cv2.resize(image_rgb, (288, 288)).astype(np.float32)
    # Apply mean subtraction and scaling
    image_normalized = (image_resized - MEAN_VALUES) * NET_SCALE_FACTOR
    # Transpose HWC to CHW (Height, Width, Channels → Channels, Height, Width)
    image_transposed = np.transpose(image_normalized, (2, 0, 1))
    # Add batch dimension
    image_input = np.expand_dims(image_transposed, axis=0)
    return image_input
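The inference call itself is then roughly the following (simplified; the full script is in the attached onnx_inference.txt, and the provider list here is illustrative):

import onnxruntime as ort

session = ort.InferenceSession(
    "GenderMFK0903.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

def classify_crop(image_bgr):
    """Run the secondary classifier on a cropped BGR image, return (class_id, probability)."""
    input_tensor = preprocess_image(image_bgr)             # (1, 3, 288, 288) float32
    probs = session.run(None, {input_name: input_tensor})[0][0]  # assuming softmaxed output
    return int(np.argmax(probs)), float(np.max(probs))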
source code:
onnx_inference.txt (4.7 KB)
preprocessing when running inference with the .trt file
// Crop the detected object and resize to the network input size
cv::Mat cropped = image(cv::Rect(x1, y1, x2 - x1, y2 - y1));
cv::resize(cropped, cropped, cv::Size(288, 288));
cropped.convertTo(cropped, CV_32F);
cv::cvtColor(cropped, cropped, cv::COLOR_BGR2RGB);

// Preprocess: per-channel mean subtraction and scaling, stored in CHW order
std::vector<float> input_tensor;
std::vector<float> mean_values = {123.675f, 116.28f, 103.53f};
float scale = 0.01735207357f;
for (int c = 0; c < 3; ++c) {
    for (int i = 0; i < cropped.rows; ++i) {
        for (int j = 0; j < cropped.cols; ++j) {
            float pixel = cropped.at<cv::Vec3f>(i, j)[c];
            input_tensor.push_back((pixel - mean_values[c]) * scale);
        }
    }
}
source code:
trt_inference.txt (5.1 KB)
Please let me know if I need to provide any more information.