I have trained a FasterRCNN model and I want to compare it (on a custom test dataset) to the Resnet-10 model from the DeepStream reference application. I successfully ran the Resnet-10 model with OpenCV, but I don't know how to parse its output. I tried several formats, but none of them work for this net.
Are there any docs about the model's output format and how to parse it?
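For reference, this is roughly how I run it (a minimal sketch; the output layer names come from the DeepStream config, while the preprocessing constants are my own guesses and may not match what nvinfer does):

```python
import cv2

# Minimal sketch of how I run the model (paths are placeholders for my local
# copies of the files shipped with the DeepStream samples).
net = cv2.dnn.readNetFromCaffe("resnet10.prototxt", "resnet10.caffemodel")

img = cv2.imread("test.jpg")
# Preprocessing is my guess from the DeepStream config: scale by ~1/255,
# resize to the 640x368 network input, no mean subtraction, RGB order.
blob = cv2.dnn.blobFromImage(img, scalefactor=1.0 / 255.0, size=(640, 368),
                             mean=(0, 0, 0), swapRB=True, crop=False)
net.setInput(blob)

# Output layer names taken from the DeepStream config.
cov, bbox = net.forward(["conv2d_cov/Sigmoid", "conv2d_bbox"])
print(cov.shape, bbox.shape)  # these are the two blobs I cannot interpret
```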
Hi
I tried the deepstream-app and it works fine. Now I am trying to run the Resnet10 model separately from DeepStream. I found that the output layers of this network are “conv2d_bbox;conv2d_cov/Sigmoid” (according to config_infer_primary.txt). I want to extract bounding boxes and probabilities from them, but I cannot understand their structure.
Is this information documented somewhere?
Or is there a way to use this config outside of DeepStream?
I have already looked through the docs and the deepstream-app code several times. What I found is that we pass the network and its configuration to the nvinfer plugin. The plugin evaluates the net, parses the results, and fills the metadata structure that is the output of the plugin.
I am interested in how nvinfer evaluates the ResNet10 model and how it parses the output. With this knowledge I would be able to evaluate the model outside of deepstream-app and compare its results to ours.
From the resnet10.prototxt file we can see that the network accepts input with shape (batchsize=1, channels=3, height=368, width=640) and that the net has 2 output layers. The first one has shape (batchsize=1, 4, 23, 40) and the second one (batchsize=1, 16, 23, 40). The first one probably contains some probabilities/confidences, and the second one the bboxes. I don't know how to parse these 2 output matrices and can't find any information related to that.
I am new to this area, so maybe I am missing something.
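One thing the shapes do suggest: the grid lines up with the input resolution (640 / 40 = 368 / 23 = 16 pixels per cell), so my working guess is 4 classes with one coverage value and 4 box values per class per grid cell. This is how I have been inspecting the blobs so far (just a sketch; the interpretation itself is unverified):

```python
import numpy as np

def inspect_outputs(cov, bbox, num_classes=4):
    """Inspect the raw blobs under the (unverified) assumption that
    cov is (1, num_classes, 23, 40) coverage/confidence per grid cell and
    bbox is (1, num_classes * 4, 23, 40), i.e. 4 box values per class."""
    _, _, grid_h, grid_w = cov.shape
    stride_x, stride_y = 640 / grid_w, 368 / grid_h   # both come out to 16
    print("grid:", grid_h, "x", grid_w, "stride:", stride_x, stride_y)

    cov = cov[0]                                       # (num_classes, 23, 40)
    bbox = bbox[0].reshape(num_classes, 4, grid_h, grid_w)
    for c in range(num_classes):
        print("class", c, "max coverage:", float(cov[c].max()))
```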
Instead of trying to evaluate the resnet10 caffe model directly, I tried a different approach. I created a very simple GStreamer plugin that extracts information from the output metadata of the nvinfer plugin.
In DeepStream 4.0 some of the source code of the nvinfer plugin ships with the SDK, but it is not much, because in the end it uses another closed-source library.
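In case it helps anyone, the same metadata can also be read from a pad probe with the DeepStream Python bindings instead of a custom plugin. A rough sketch (the pyds calls follow the deepstream_python_apps samples, so treat the exact API as an assumption for your DeepStream version):

```python
import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst
import pyds

def nvinfer_src_pad_probe(pad, info, u_data):
    """Read detections that nvinfer attached to the buffer as NvDs metadata."""
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        return Gst.PadProbeReturn.OK

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        l_obj = frame_meta.obj_meta_list
        while l_obj is not None:
            obj = pyds.NvDsObjectMeta.cast(l_obj.data)
            r = obj.rect_params
            print(frame_meta.frame_num, obj.class_id, obj.confidence,
                  r.left, r.top, r.width, r.height)
            try:
                l_obj = l_obj.next
            except StopIteration:
                break
        try:
            l_frame = l_frame.next
        except StopIteration:
            break
    return Gst.PadProbeReturn.OK

# Attach to the nvinfer element's src pad, e.g.:
# pgie.get_static_pad("src").add_probe(Gst.PadProbeType.BUFFER,
#                                      nvinfer_src_pad_probe, 0)
```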
If I use the FasterRCNN Caffe model trained for custom classes, instead of the resnet10 that the default SDK uses, how much of an improvement can I expect in the mAP score?
There will be an accuracy downgrade if I switch to the 4-layer ResNet model, but is there a way to profile the output of the pipeline streams?
The resnet10.prototxt files are different in your repository (https://github.com/AastaNV/DeepStream) and in what comes with DeepStream 4.0, so I guess the model has been upgraded.
My question is whether the parser given in your repository is still valid for the DeepStream 4.0 resnet10 model.
Actually, I want to write a Python parser for the resnet10 model.
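Roughly what I have in mind is porting the grid decoding from that parser to Python, something like the sketch below. The stride of 16 and the 35.0 bbox normalization are copied from the sample parser and are assumptions that may need re-checking against the 4.0 model:

```python
import numpy as np

def parse_resnet10(cov, bbox, num_classes=4, stride=16.0, bbox_norm=35.0,
                   cov_threshold=0.2):
    """Decode raw conv2d_cov/Sigmoid and conv2d_bbox blobs into boxes.

    cov:  (1, num_classes, grid_h, grid_w) coverage / confidence
    bbox: (1, num_classes * 4, grid_h, grid_w) box offsets per grid cell
    Constants follow the sample parser in the repository and are assumptions.
    Returns a list of (class_id, confidence, x1, y1, x2, y2) in input pixels.
    """
    _, _, grid_h, grid_w = cov.shape
    # Grid cell centres, normalised the same way the sample parser does.
    cx = (np.arange(grid_w) * stride + 0.5) / bbox_norm      # (grid_w,)
    cy = (np.arange(grid_h) * stride + 0.5) / bbox_norm      # (grid_h,)

    detections = []
    for c in range(num_classes):
        conf = cov[0, c]                                     # (grid_h, grid_w)
        box = bbox[0, c * 4:(c + 1) * 4]                     # (4, grid_h, grid_w)
        ys, xs = np.where(conf >= cov_threshold)
        for y, x in zip(ys, xs):
            x1 = (box[0, y, x] - cx[x]) * -bbox_norm
            y1 = (box[1, y, x] - cy[y]) * -bbox_norm
            x2 = (box[2, y, x] + cx[x]) * bbox_norm
            y2 = (box[3, y, x] + cy[y]) * bbox_norm
            detections.append((c, float(conf[y, x]),
                               float(x1), float(y1), float(x2), float(y2)))
    return detections
```

The raw per-cell boxes would still need the clustering/grouping step that nvinfer applies internally before they are directly comparable to the deepstream-app detections.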