Since I wanted to parse this model's output myself, I used output-tensor-meta. I also used write-output-file to dump the raw output of my model and gain more insight.
Note: write-output-file also appears to dump the input that my model receives.
I proceeded as follows:
1 - Run DeepStream on a video sample (incorrect results)
2 - Use the dumped input as input of my engine and run it through TensorRT directly (correct results)
3 - Parse the dumped output directly (incorrect results)
Note: To be sure of what result I should get, I overfit my model on the video sample.
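For reference, step 2 looks roughly like this. This is a minimal sketch, not my exact script: the file names, the float32 dump format, the static shapes and the single input/output binding layout are assumptions.

```python
import numpy as np
import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit  # noqa: F401  (creates the CUDA context)

# Hypothetical file names: the engine built from my ONNX model and the raw
# input buffer dumped by DeepStream (assumed float32, already pre-processed).
ENGINE_PATH = "classifier.engine"
DUMPED_INPUT = "dumped_input.bin"

logger = trt.Logger(trt.Logger.WARNING)
with open(ENGINE_PATH, "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Reshape the dump to the input binding's shape; no pre-processing is applied here.
input_data = np.fromfile(DUMPED_INPUT, dtype=np.float32)
input_data = np.ascontiguousarray(input_data.reshape(tuple(engine.get_binding_shape(0))))
output_data = np.empty(tuple(engine.get_binding_shape(1)), dtype=np.float32)

d_input = cuda.mem_alloc(input_data.nbytes)
d_output = cuda.mem_alloc(output_data.nbytes)
cuda.memcpy_htod(d_input, input_data)
context.execute_v2([int(d_input), int(d_output)])
cuda.memcpy_dtoh(output_data, d_output)

print(output_data)  # raw classifier output; correct when fed the dumped input
```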
At first I thought the cause was my post-inference parsing, but the problem is already present when the raw output is dumped. However, step 2 shows the engine infers correctly from the same input outside of DeepStream.
Any ideas as to what might cause this peculiar behavior?
Before trying to plug the secondary classifier in, I only used the primary detector and always got good results.
The input dumped by write-output-file is the input of my secondary classifier, which is the output of my primary detector. I am even able to recreate a correct image from this dumped input by reversing the net-scale-factor and offsets operations.
Yes, I do. As I said, I can obtain a correct image from the dumped data of the input layer by reversing the pre-processing operation.
The fact that my engine works when using TensorRT alone shows that the dumped input (the state of my data at the input layer) is already pre-processed: when I try to infer with TensorRT on a cropped input of my own, I need to pre-process it to get correct results.
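Reversing the pre-processing on the dumped input looks roughly like this (a sketch; the dump file name, the float32 CHW layout and the 224x224 input size are assumptions):

```python
import numpy as np
from PIL import Image

NET_SCALE_FACTOR = 0.0039215697906911373
OFFSETS = np.array([123.675, 116.128, 103.53], dtype=np.float32)

# Hypothetical dump name; assumed float32, CHW order, RGB, 224x224 input size.
chw = np.fromfile("dumped_input.bin", dtype=np.float32).reshape(3, 224, 224)

# Invert (x - offsets) * net-scale-factor:  x = y / net-scale-factor + offsets
hwc = chw.transpose(1, 2, 0) / NET_SCALE_FACTOR + OFFSETS
Image.fromarray(np.clip(hwc, 0, 255).astype(np.uint8)).save("recovered_crop.png")
```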
First, regarding the input: the bboxes are the same. When you talk about input YUV data, you must be referring to the raw input and the color format it comes in, right? If so, yes, it is the same data, which is in RGB.
Yes, for your point 3, both outputs are different. They have the same dimensions and no object is missing, but the values inside are different, which yields different labels once post-processing is done.
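This is roughly how I compared the two dumps (a sketch; the file names and the float32 layout are assumptions):

```python
import numpy as np

# Hypothetical file names for the two raw output dumps (assumed float32, same layout).
ds_out = np.fromfile("deepstream_output.bin", dtype=np.float32)
trt_out = np.fromfile("tensorrt_output.bin", dtype=np.float32)

print(ds_out.shape, trt_out.shape)            # same dimensions
print(np.argmax(ds_out), np.argmax(trt_out))  # different predicted labels
print(np.abs(ds_out - trt_out).max())         # element-wise values differ
```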
Regarding your fifth point, I didn't use any IPlugin layer in this model; I only specified the location of the ONNX model and let DeepStream do the conversion.
Given that the input and pre-processing are the same, and that the engine itself isn't faulty (it yields good results with TensorRT), yet the raw output is somehow different when running through DeepStream, I am inclined to think that I might have missed a critical parameter in the config file.
Hi mchi, no, I don't think so. As you can see in my config file, both input-object-min-width and input-object-min-height are set to 0, so all detected objects go through my secondary model.
Hi ChrisDing, I ran multiple tests using TensorRT only. When I use the input dumped by write-output-file, I don't run any pre-processing (it is already done, since I have to reverse the pre-processing to obtain a normal image).
When I use a normal cropped image as input, I run the exact same pre-processing as specified in the config file:
(x - offsets) * (net-scale-factor)
with:
net-scale-factor=0.0039215697906911373
offsets=123.675;116.128;103.53
In both cases, the resulting predictions from TensorRT are correct.
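Concretely, the pre-processing I apply to a cropped image before feeding it to TensorRT is roughly the following (a sketch; the crop file name, the RGB channel order and the 224x224 input size are assumptions):

```python
import numpy as np
from PIL import Image

NET_SCALE_FACTOR = 0.0039215697906911373
OFFSETS = np.array([123.675, 116.128, 103.53], dtype=np.float32)

# Hypothetical crop file; resized to an assumed 224x224 network input, RGB order.
img = np.asarray(Image.open("crop.png").convert("RGB").resize((224, 224)), dtype=np.float32)

# Same operation as the config: (x - offsets) * net-scale-factor, then HWC -> NCHW.
preprocessed = ((img - OFFSETS) * NET_SCALE_FACTOR).transpose(2, 0, 1)[np.newaxis, ...]
```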