DeepStream - Loading Custom Model

Hello Again All,

So, I am running into similar issues trying to feed different models into the DeepStream configuration file. Specifically, using the gender detection caffemodel from the site below consistently yields “Segmentation fault (core dumped)”, perhaps because it is accessing memory it does not have access to (still learning). Has anyone experienced this, or by chance gotten a custom model to work in the nvgstiva-app sample?

https://github.com/GilLevi/AgeGenderDeepLearning

Though this is more than likely down to my lack of knowledge here, as I only replaced the “old commented lines” below with the new model and prototxt files. Any thoughts or notes by chance? Any direction appreciated. Cheers.

model-file=file:///home/nvidia/deepstream/configs/age-gender-test/gender_net.caffemodel
proto-file=file:///home/nvidia/deepstream/configs/age-gender-test/deploy_age.prototxt
mean-file=file:///home/nvidia/deepstream/configs/age-gender-test/mean.binaryproto
#model-file=file:///home/nvidia/Model/ResNet_18/ResNet_18_threeClass_VGA_pruned.caffemodel
#proto-file=file:///home/nvidia/Model/ResNet_18/ResNet_18_threeClass_VGA_deploy_pruned.prototxt
#model-cache=file:///home/nvidia/Model/ResNet_18/ResNet_18_threeClass_VGA_pruned.caffemodel_b2_fp16.cache

Hi jfcarp,
I think we should support mean-file. Our QA is helping to reproduce this issue, and we will get back to you once we have an update.

Thanks
wayne zhu

Ahhhh… I see. Sorry, noticed the configuration item in the web docs. Appreciate the info and time taken, thank you very much! Please let me know if there is any way I can assist, would be happy to learn :)

Cheers,
James

Question for you. Is there a reason the project isn’t hosted on GitHub or open-sourced so that others can contribute and assist with development?

That is, what are NVIDIA’s thoughts on the sample app and its future?

Thanks in advance.

Hi Jfcarp,
For the GStreamer OMX plugin, I think we have already published some of the plugin code.
You can find it here:
L4T Libgstomx for gstreamer 1-0 sources at https://developer.nvidia.com/embedded/downloads#?search=gstreamer
For the nvinfer plugin, we plan to publish the code, but there are some network license issues at the moment.

On the app side, we have already published part of the code: nvgst-iva-app is already included in the package, but we have not published nvgst-iva-app-ui yet.

Thanks
wayne zhu

Hello,

So, I am returning to this thread with a similar question, as well as an update on the mean-file reproduction above.

  • Is mean-file expected to be supported in the upcoming v2.0 releases?

Additionally, we have been trying to get any custom model to work in DeepStream, without any luck.

Several examples of our experiences thus far:

  • using ped-100 or multiped-500 from the Jetson-inference tutorial, the models load, but no bounding boxes are returned
  • our own custom model, which uses a DetectNet prototxt similar to the ped/multiped models, was finally able to load without a segmentation fault, but still yields no detections

What is it we might be missing?

The only difference between these models and ResNet that we can find is perhaps the image size the network was trained on, but I’m not sure this is a factor, as the images should be resized to the network input, right?

What else can we look at to get nvOSD to overlay bboxes? We have matched the blob and bbox layer names with what our model outputs, and also looked at the pixel normalization factor, etc., but still no luck. Per our data scientist, the outputs of the ResNet_18 model are the same as those of the DetectNet prototxt we are using. He explained it as a grid output, and both are the same.

Which leads us to the parse-func. We tried the GoogleNet, NVIDIA type 1/2, and ResNet options, but no luck. Could this be the culprit that is preventing nvOSD from overlaying bboxes? And if so, how do these parse-funcs work, and is there any direction that can be provided if we need to create our own library for this?

Thank you in advance,
James.

I will add more detail to James’ question.

The story started with using NVIDIA DIGITS to train a single-class DetectNet on a COCO subset (the subset was created by filtering out all classes but person; in other words, it only includes person, and the other classes are reassigned to dontcare).

As we know, DetectNet is a modification of GoogLeNet for object detection ( https://devblogs.nvidia.com/detectnet-deep-neural-network-object-detection-digits/ ).

  1. Training ran for 5 days; mAP is about 0.26.
  2. We tested the trained model on the COCO subset inside DIGITS; it works fine, and persons are detected.
  3. After changing deploy.prototxt (mainly the height and width), we used example.py to test our own images (https://gist.github.com/lukeyeager/777087991419d98700054cade2f755e6); it works fine, and persons are detected by the trained model.
  4. Next, we decided to run the trained model with TensorRT. To do so, the last Python layer in deploy.prototxt must be removed:

layer {
  name: "cluster"
  type: "Python"
  bottom: "coverage"
  bottom: "bboxes"
  top: "bbox-list"
  python_param {
    module: "caffe.layers.detectnet.clustering"
    layer: "ClusterDetections"
    param_str: "1920, 1072, 16, 0.4, 2, 0.02, 22, 1"
  }
}

We tried parsefunc = googlenet and NVIDIA type 1 or 2; these always report a segmentation fault. With parsefunc = Resnet there is no segmentation fault, but no bounding boxes are detected either. Selecting parsefunc = Resnet does not really make sense, since the subnet of DetectNet is GoogLeNet. I do not know what NVIDIA type 1 and 2 are, but I guess they are related to object detection, and I guess parsefunc = googlenet is for classification.

Interestingly, when we tried ResNet in TensorRT, there was no segmentation fault and we saw bounding boxes detected.

After removing the Python layer above, DetectNet and ResNet are similar. The difference is that the subnet of DetectNet is GoogLeNet, while the subnet of ResNet is the feature-extraction part of ResNet. I read both deploy.prototxt files and compared the output layers (coverage and bounding boxes) of ResNet and DetectNet; they are the same. We ran both ResNet and DetectNet (with the last layer removed) through example.py (https://gist.github.com/lukeyeager/777087991419d98700054cade2f755e6), and the outputs (coverage and bboxes) are grid-based:

For DetectNet (single class), the shape of coverage is batchsize x 1 x (height/stride) x (width/stride) and the shape of bboxes is batchsize x 4 x (height/stride) x (width/stride).

For ResNet (3 classes), the shape of coverage is batchsize x 3 x (height/stride) x (width/stride) and the shape of bboxes is batchsize x 12 x (height/stride) x (width/stride).

The results are comparable. In other words, the only difference is the subnet used for feature extraction.
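As a quick sanity check on the shapes above, here is a minimal sketch in C. The helper name grid_output_shapes and the GridShapes struct are made up for illustration, and the code assumes one coverage channel and four bbox channels per class, which matches the 1/4 and 3/12 figures quoted above.

```c
#include <assert.h>

/* N x C x H x W shapes of the two grid outputs. */
typedef struct {
    int coverage[4];
    int bboxes[4];
} GridShapes;

/* Hypothetical helper: computes the shapes quoted in the post.
 * Integer division stands in for height/stride and width/stride. */
GridShapes grid_output_shapes(int batch, int num_classes,
                              int height, int width, int stride)
{
    assert(batch > 0 && num_classes > 0 && stride > 0);

    GridShapes s;
    s.coverage[0] = batch;
    s.coverage[1] = num_classes;        /* one coverage map per class */
    s.coverage[2] = height / stride;
    s.coverage[3] = width / stride;

    s.bboxes[0] = batch;
    s.bboxes[1] = 4 * num_classes;      /* four coordinates per class */
    s.bboxes[2] = height / stride;
    s.bboxes[3] = width / stride;
    return s;
}
```

For example, if the 1920, 1072 and 16 in param_str are width, height and stride (an assumption), a single-class network with batch size 1 would produce a 1 x 1 x 67 x 120 coverage blob and a 1 x 4 x 67 x 120 bboxes blob.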

The last (Python) layer converts the grid-based coverage and bboxes into the list of detected bounding boxes. I guess TensorRT handles this internally.
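A rough sketch of what that conversion involves, written in C since a custom parse library would most likely be native code. This is not the DIGITS or TensorRT implementation. It handles a single class, assumes row-major planes, and assumes the four bbox planes hold pixel offsets (x1, y1, x2, y2) from the top-left corner of each grid cell; all of these should be verified against your own network. GridBox and decode_grid are hypothetical names.

```c
#include <assert.h>

typedef struct {
    float x1, y1, x2, y2;   /* absolute pixel coordinates */
    float score;            /* coverage value of the originating cell */
} GridBox;

/* Decode one class of a DetectNet-style grid into absolute pixel boxes.
 * cov:  grid_h * grid_w coverage values, row-major.
 * bbox: 4 planes (x1, y1, x2, y2) of grid_h * grid_w values each, ASSUMED
 *       to be pixel offsets from the top-left corner of the cell.
 * Returns the number of boxes written to out (at most max_out). */
int decode_grid(const float *cov, const float *bbox,
                int grid_h, int grid_w, int stride, float threshold,
                GridBox *out, int max_out)
{
    assert(cov && bbox && out);
    assert(grid_h > 0 && grid_w > 0 && stride > 0 && max_out >= 0);

    const int plane = grid_h * grid_w;
    int n = 0;

    for (int y = 0; y < grid_h; ++y) {
        for (int x = 0; x < grid_w; ++x) {
            const int i = y * grid_w + x;
            if (cov[i] < threshold)
                continue;               /* cell is not confident enough */
            if (n == max_out)
                return n;               /* output buffer is full */

            /* Pixel position of the cell's top-left corner. */
            const float ox = (float)(x * stride);
            const float oy = (float)(y * stride);

            out[n].x1 = bbox[0 * plane + i] + ox;
            out[n].y1 = bbox[1 * plane + i] + oy;
            out[n].x2 = bbox[2 * plane + i] + ox;
            out[n].y2 = bbox[3 * plane + i] + oy;
            out[n].score = cov[i];
            ++n;
        }
    }
    return n;
}
```

If a function like this returns zero boxes for a frame that clearly contains a person, the threshold, the stride, or the assumed layout is the first thing to check. Neighbouring cells usually fire on the same object, so the raw output still needs a merge step afterwards.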

This is the whole story.

The weird thing is why there is a segmentation fault with DetectNet but not with ResNet. Thanks in advance for your response.

Hi jfcarp,
As far as I know, mean-file is supported now.
Does it not work on your side?

What parse-func does is translate your network’s output into BBOX_meta (a structure defined by NVIDIA). You need to write this function for your network and pass the result into BBOX_meta.
Once it is filled, this metadata is passed down the pipeline and finally reaches overlaysink.
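To make the idea more concrete, here is a minimal sketch in C of the last stage of such a function: merging overlapping candidate boxes and filling a result structure. DemoBBoxMeta is a hypothetical stand-in, not NVIDIA's BBOX_meta, whose real layout should be taken from the SDK headers; only the num_rects field name is borrowed from the sample code in this thread. Greedy IoU suppression is used as a simple merge strategy and is not necessarily what the built-in parse functions do.

```c
#include <assert.h>

#define DEMO_MAX_RECTS      16
#define DEMO_MAX_CANDIDATES 256

typedef struct { float x1, y1, x2, y2, score; } DemoBox;

/* HYPOTHETICAL result structure, not the SDK's BBOX_meta. */
typedef struct {
    int num_rects;
    DemoBox rects[DEMO_MAX_RECTS];
} DemoBBoxMeta;

/* Intersection over union of two axis-aligned boxes. */
static float demo_iou(const DemoBox *a, const DemoBox *b)
{
    const float ix1 = a->x1 > b->x1 ? a->x1 : b->x1;
    const float iy1 = a->y1 > b->y1 ? a->y1 : b->y1;
    const float ix2 = a->x2 < b->x2 ? a->x2 : b->x2;
    const float iy2 = a->y2 < b->y2 ? a->y2 : b->y2;
    const float iw = ix2 - ix1;
    const float ih = iy2 - iy1;
    if (iw <= 0.0f || ih <= 0.0f)
        return 0.0f;

    const float inter = iw * ih;
    const float area_a = (a->x2 - a->x1) * (a->y2 - a->y1);
    const float area_b = (b->x2 - b->x1) * (b->y2 - b->y1);
    const float uni = area_a + area_b - inter;
    return uni > 0.0f ? inter / uni : 0.0f;
}

/* Greedy suppression: repeatedly keep the best-scoring remaining candidate
 * and drop every other candidate overlapping it by more than iou_thresh. */
void demo_fill_meta(const DemoBox *cand, int n, float iou_thresh,
                    DemoBBoxMeta *meta)
{
    assert(cand && meta && n >= 0 && n <= DEMO_MAX_CANDIDATES);

    unsigned char dropped[DEMO_MAX_CANDIDATES] = {0};
    meta->num_rects = 0;

    while (meta->num_rects < DEMO_MAX_RECTS) {
        int best = -1;
        for (int i = 0; i < n; ++i)
            if (!dropped[i] && (best < 0 || cand[i].score > cand[best].score))
                best = i;
        if (best < 0)
            break;                      /* nothing left to keep */

        meta->rects[meta->num_rects++] = cand[best];
        dropped[best] = 1;

        for (int i = 0; i < n; ++i)
            if (!dropped[i] && demo_iou(&cand[best], &cand[i]) > iou_thresh)
                dropped[i] = 1;
    }
}
```

In a real parse function the kept rectangles would be copied into the SDK's own structure instead of DemoBBoxMeta, in whatever coordinate space the overlay expects.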

For debugging, you can add a probe on the overlay sink’s source pad to check the detection output.
You can refer to the following functions in the app:
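#include <gst/gst.h>
/* IvaMeta and BBOX_Params come from the DeepStream SDK headers. */

/* Serializes output from concurrent probes. The sample app already defines
 * this; only declare it if you copy these functions elsewhere. */
static GMutex print_lock;

/* Print the number of rectangles carried by each metadata item on a buffer.
 * As in the original sample, every GstMeta on the buffer is treated as IvaMeta. */
static void
print_metadata (GstBuffer * src_buf, GstElement * elem, gchar * pad)
{
  IvaMeta *meta;
  gpointer state = NULL;
  BBOX_Params *params;

  g_mutex_lock (&print_lock);

  g_print ("______ Metadata for %s :%s______\n",
      elem ? GST_ELEMENT_NAME (elem) : "(no parent)", pad);

  /* The loop ends when gst_buffer_iterate_meta () returns NULL. */
  while ((meta = (IvaMeta *) gst_buffer_iterate_meta (src_buf, &state))) {
    params = meta->meta_data;
    if (!params) {
      continue;
    }

    g_print ("Has metadata: num_rects=%d\n", params->num_rects);
  }
  g_print ("-------------------------------\n");
  g_mutex_unlock (&print_lock);
}

/* Pad probe callback: dump bounding-box metadata for every buffer. */
static GstPadProbeReturn
bbox_debug_probe (GstPad * pad, GstPadProbeInfo * info, gpointer u_data)
{
  if (info->type & GST_PAD_PROBE_TYPE_BUFFER) {
    GstBuffer *buffer = (GstBuffer *) info->data;
    /* Both calls return a new reference/copy that must be released. */
    GstElement *elem = gst_pad_get_parent_element (pad);
    gchar *name = gst_pad_get_name (pad);

    print_metadata (buffer, elem, name);

    g_free (name);
    if (elem) {
      gst_object_unref (elem);
    }
  }

  return GST_PAD_PROBE_OK;
}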

Thanks
wayne zhu

Hi jfcarp,

Is it possible to share your model, so that I can try it on my side?

Thanks
wayne zhu

Hello,

Thank you for the direction, much appreciated. I will have to take some time to dig into this a bit, but we are using DetectNet, so if you can get ped or multi-ped running from the Jetson-Inference tutorial, it should be the same as our model.

Also, will test the mean-file again when I get a chance, hoping to get ped-100 working first :)

Thanks again, will get back once I have more to go on, cheers.

Could you check if it is possible to load VGG based model, please?