Thank you both for the clarification.
It seems the root cause is in TensorRT rather than DeepStream.
Could you share a sample with us that produces the expected (OnnxRuntime?) and incorrect (TensorRT) results?
We want to check this with the TensorRT team.
Hi,
I have managed to add some code that dumps the results of running efficientdet-d0.onnx in OnnxRuntime to the file onnxruntime_result.json, using the same format as trtexec-result.json, which was produced by running trtexec with efficientdet-d0.onnx.
The raw input data, preprocessed for efficientdet-d0.onnx (BGR2RGB, scaling, padding, normalization, ...) in the OnnxRuntime pipeline, is also written to a file o4_clip-1_raw_data.bin. I obtained trtexec-result.json by running trtexec with --loadInputs='data':'o4_clip-1_raw_data.bin', so you can test with the same data for your investigation.
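For reference, a minimal sketch of that preprocessing and dump flow (not my exact script): the 512x512 input size and the ImageNet mean/std are assumptions for a typical EfficientDet-D0 pipeline; only the BGR2RGB/scale/pad/normalize steps, the np.tofile() dump, and the input name 'data' come from the posts above.

```python
import cv2
import numpy as np
import onnxruntime as ort

INPUT_SIZE = 512                      # assumed EfficientDet-D0 input resolution

img = cv2.imread("o4_clip-1.png")     # OpenCV loads BGR
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Letterbox: scale the long side to INPUT_SIZE, pad the rest with zeros.
scale = INPUT_SIZE / max(img.shape[:2])
resized = cv2.resize(img, (int(img.shape[1] * scale), int(img.shape[0] * scale)))
padded = np.zeros((INPUT_SIZE, INPUT_SIZE, 3), dtype=np.float32)
padded[: resized.shape[0], : resized.shape[1]] = resized

# Normalize (ImageNet mean/std assumed) and convert HWC -> NCHW float32.
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
tensor = ((padded / 255.0 - mean) / std).transpose(2, 0, 1)[None].astype(np.float32)

# Dump exactly the bytes fed to the network, so trtexec can reuse them via --loadInputs.
tensor.tofile("o4_clip-1_raw_data.bin")

# Run the same tensor through OnnxRuntime to produce the reference result.
sess = ort.InferenceSession("efficientdet-d0.onnx")
outputs = sess.run(None, {"data": tensor})   # input name taken from the trtexec command
```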
The final results, with bounding boxes drawn on the original image, are also attached for your reference:
o4_clip-1-OnnxRuntime_result.jpg is the result parsed from the output of running OnnxRuntime with efficientdet-d0.onnx, and o4_clip-1-TensorRT_result.jpg is the result parsed from trtexec-result.json, the output of running trtexec with efficientdet-d0.onnx. All bounding boxes are filtered with score_threshold=0.1 and iou_threshold=0.1.
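The filtering I apply is a plain score cut followed by greedy NMS; a small sketch is below (the [x1, y1, x2, y2] box layout is an assumption about the parsed output, not something confirmed by the model):

```python
import numpy as np

def filter_detections(boxes, scores, score_threshold=0.1, iou_threshold=0.1):
    """Keep boxes above score_threshold, then apply greedy NMS at iou_threshold.
    boxes: (N, 4) as [x1, y1, x2, y2]; scores: (N,)."""
    keep_mask = scores >= score_threshold
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = scores.argsort()[::-1]        # indices sorted by descending score
    kept = []
    while order.size > 0:
        i = order[0]
        kept.append(i)
        # IoU of the highest-scoring remaining box against the rest
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= iou_threshold]
    return boxes[kept], scores[kept]
```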
Please note that the above efficientdet-d0.onnx was not exported from the standard efficientdet-d0.pth weights (trained on the COCO dataset); it was exported from a weight file trained on our own dataset, which has only one class: baggage. The original image file o4_clip-1.png is also attached for your reference.
Please ignore the result shown in the o4_clip-1-TensorRT_result.jpg attached last time; I forgot to scale the bounding boxes back to their actual size using the scale ratio applied during image preprocessing, so the size and position of the two boxes in that image are wrong.
I have now corrected this error, done more testing, and collected the raw image data and the results from OnnxRuntime vs. trtexec. As you can see, the ONNX model run through trtexec sometimes recognizes the same targets but with much lower scores than when it is run through OnnxRuntime, and sometimes a target's score falls below 0.1 (my confidence threshold), so no box is drawn at all (compare o4_clip-5-TensorRT_result.jpg with o4_clip-5-OnnxRuntime_result.jpg).
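The correction itself is just mapping boxes from network-input coordinates back to the original image; a minimal sketch, assuming `scale` is the same letterbox ratio used in preprocessing (INPUT_SIZE / max(orig_h, orig_w)):

```python
import numpy as np

def boxes_to_original(boxes, scale, orig_w, orig_h):
    # boxes: (N, 4) as [x1, y1, x2, y2] in network-input (padded/resized) coordinates
    boxes = np.asarray(boxes, dtype=np.float32) / scale    # undo the resize
    boxes[:, 0::2] = boxes[:, 0::2].clip(0, orig_w)        # clamp x1, x2 to image width
    boxes[:, 1::2] = boxes[:, 1::2].clip(0, orig_h)        # clamp y1, y2 to image height
    return boxes
```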
Please use a RAR tool to extract the attached zip files as a whole:
No, I think the cause of the precision degradation I saw is completely different. I captured the raw input data passed to OnnxRuntime's run() with np.tofile(), and the same raw data gives a much better result through OnnxRuntime's run() than through trtexec. You can test with the raw data I uploaded in the zip file on 5 Nov to compare trtexec and OnnxRuntime.
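For the comparison, a short sketch of how the dumped buffer can be reloaded and run through OnnxRuntime; the float32 dtype and the 1x3x512x512 shape are assumptions, while the file name and the input name 'data' come from the earlier posts:

```python
import numpy as np
import onnxruntime as ort

# Reload exactly the bytes that were dumped with np.tofile() during preprocessing.
raw = np.fromfile("o4_clip-1_raw_data.bin", dtype=np.float32).reshape(1, 3, 512, 512)

sess = ort.InferenceSession("efficientdet-d0.onnx")
ort_outputs = sess.run(None, {"data": raw})

# The same .bin can be fed to trtexec via --loadInputs='data':'o4_clip-1_raw_data.bin';
# compare ort_outputs against the tensors parsed from trtexec-result.json.
```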
I have a similar issue with EfficientNet and EfficientDet when converting from ONNX to TensorRT. In ONNX Runtime the predictions are fine, but when doing inference with TensorRT I only get a zero tensor.
After downloading all four files above to the same directory, rename "efficientdet-d0-s.z01.zip" to "efficientdet-d0-s.z01", do the same for "efficientdet-d0-s.z02.zip" and "efficientdet-d0-s.z03.zip", then unzip efficientdet-d0-s.zip to get efficientdet-d0-s.onnx.
Hi,
I have some updates regarding this issue. I managed to deserialize the engine generated by DeepStream in TensorRT, and with trt_sample.cpp (8.5 KB) I confirmed that the result is correct and matches the actual model, so I am now sure the issue comes from DeepStream's preprocessing. To reproduce the issue and understand what DeepStream's actual preprocessing is, I removed all preprocessing steps in both DeepStream and TensorRT and am trying to get the same result.
Attached are the TensorRT code and the DeepStream config file.
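For readers without the C++ sample, here is a rough Python equivalent of the deserialization check (this is not the attached trt_sample.cpp); the engine file name is a placeholder for whatever engine DeepStream serialized, and the binding calls assume the TensorRT 8.x Python API:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

# Deserialize the engine file that DeepStream wrote (placeholder name).
with open("model_b1_gpu0_fp16.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Inspect the bindings to confirm the engine's I/O matches the ONNX model.
for i in range(engine.num_bindings):
    print(engine.get_binding_name(i), engine.get_binding_shape(i),
          "input" if engine.binding_is_input(i) else "output")
```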
Hi,
It seems the difference comes from the way DeepStream reads the images. When we compare the pixel values in DeepStream before preprocessing, they are not the same as what OpenCV reads from the image. Why is there a difference, and how can we get exactly the same confidence values as the actual model?
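One way to quantify that difference, assuming the DeepStream-side buffer can be dumped to a file (e.g. from a pad probe); the file names and the RGB/uint8 layout here are hypothetical:

```python
import cv2
import numpy as np

# Buffer dumped from the DeepStream pipeline before preprocessing (hypothetical file).
deepstream_pixels = np.fromfile("deepstream_frame.bin", dtype=np.uint8)

# Same frame read with OpenCV, converted from BGR to RGB for a like-for-like comparison.
opencv_pixels = cv2.cvtColor(cv2.imread("o4_clip-1.png"), cv2.COLOR_BGR2RGB).ravel()

n = min(deepstream_pixels.size, opencv_pixels.size)
diff = np.abs(deepstream_pixels[:n].astype(np.int16) - opencv_pixels[:n].astype(np.int16))
print("max abs pixel diff:", diff.max(), "mean:", diff.mean())
```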