I’m still struggling with DeepStream, using Python to try out different models. I have some questions that I can’t find the answers to in the DeepStream documentation.
• Hardware Platform (Jetson / GPU)
Jetson AGX Xavier
• DeepStream Version
• JetPack Version (valid for Jetson only)
• Issue Type
I’m trying out different models like “Facial Landmarks” (NVIDIA NGC).
I’m not able to fill in the configuration file correctly: I need to define the input and output names of the network, but I can’t see them documented on the page for the model. How do I find the input/output names for any .etlt model? This seems like a common problem. I’ve seen your recommendation to look at other configuration files, but that doesn’t actually answer the question; in this case, there is no configuration file.
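For reference, this is where I understand the layer names are supposed to go in the nvinfer config. The blob names below are placeholders I made up, not the real layer names of the landmarks model; finding the real names is exactly my problem:

```
# Sketch of an nvinfer [property] section for an .etlt model.
# NOTE: "input_1" and "output_1" are placeholder layer names,
# NOT the actual names of the facial landmarks network.
[property]
tlt-encoded-model=model.etlt
tlt-model-key=nvidia_tlt
uff-input-blob-name=input_1
output-blob-names=output_1
```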
The facial landmark model can produce different numbers of output points: 68, 80 or 104. How do I specify which output I would like? Is this set in the configuration file somehow?
When looking in the DeepStream examples for e.g. secondary classifiers like car-make or car-color, I can’t find the function that converts the tensor output from the model into the metadata structure that is passed to the pads/sinks. When is this done automatically, and when do I need to create my own converter? If I use e.g. my own bounding-box model, how can I re-use the functions you are using? Do I need to write my own converter function for the facial landmark model, or is it done automatically as in the other examples? And how do I scale the coordinates to the image?
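To make the scaling question concrete, this is how I currently assume the coordinates should be mapped: landmark points come out in the network’s input resolution and have to be scaled into the face ROI found by the primary detector. This is only my guess at the convention, not confirmed behavior of the landmarks model:

```python
def scale_landmarks(landmarks, net_w, net_h, roi_left, roi_top, roi_w, roi_h):
    """Map landmark points from network-input coordinates (net_w x net_h)
    back into full-frame pixel coordinates of the detected face ROI.
    This is my assumption of the coordinate convention; the model's
    actual normalization may differ."""
    sx = roi_w / net_w   # horizontal scale: ROI width over network width
    sy = roi_h / net_h   # vertical scale: ROI height over network height
    return [(roi_left + x * sx, roi_top + y * sy) for (x, y) in landmarks]

# Example: a point at the center of an 80x80 network input, with the face
# ROI at (100, 50) and size 160x160, should land at the center of the ROI.
pts = scale_landmarks([(40.0, 40.0)], 80, 80, 100, 50, 160, 160)
# pts -> [(180.0, 130.0)]
```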
The image is normalized, resized and converted to a tensor as input to the primary detector. But for the secondary classifier, is the image already normalized? Is that why the scaling factor is set to 1 in the examples for e.g. car-color?
I do not understand how to set the network-type (0: detector, 1: classifier, etc.). The facial landmark model is not a detector, since it does not produce bounding boxes, and it’s not a classifier either. Does 0 mean that the output is a regression problem, while 1 converts one-hot encoded outputs to classes? What exactly is this switch doing?
For the primary classifier/detector it’s common to define the shape of the input tensor (infer-dims=3;160;160), but I can’t find this switch in the examples I’ve found for the secondary classifier. Does the network detect the input shape automatically, or when is this switch needed? Is the preprocessing able to both up- and down-scale an image?
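This is the SGIE section I’m currently experimenting with. The values are my own guesses (in particular the infer-dims), based on my reading that a network which is neither detector nor classifier should use network-type=100 (“other”) together with output-tensor-meta=1 so the raw tensors can be parsed in a probe:

```
# Sketch of my SGIE config; values are guesses, not verified settings.
[property]
network-type=100      ; "other": neither detector (0) nor classifier (1)
output-tensor-meta=1  ; attach raw output tensors so a pad probe can parse them
infer-dims=3;80;80    ; placeholder input shape, not confirmed for this model
process-mode=2        ; operate on objects found by the primary detector
```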
I would like to save the metadata to a file in JSON format and have been looking at the gst-nvmsgconv plugin (Gst-nvmsgconv — DeepStream 5.1 Release documentation). But I can’t understand how to use it to convert the metadata to JSON. I was expecting the “payload-type” property to let me choose the output format, e.g. JSON, since there seem to be different formats, but this property seems to control how much information is stored in the message. Another question that pops up: “PAYLOAD_DEEPSTREAM” is one setting, but it’s not documented which number it corresponds to. I’m guessing it’s 0 or 1; why is this not defined?
Regarding gst-nvmsgconv: where is the final result stored, so that I can access it from a pad probe and save it to a file?
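In the meantime, as a workaround I’m serializing the fields I extract in a pad probe myself with the standard json module. The structure below (frame, objects, bbox) is entirely my own ad-hoc schema, not the official DeepStream message schema:

```python
import json

def metadata_to_json(frame_num, objects):
    """Serialize per-frame metadata to a JSON string.
    'objects' is a list of (label, (left, top, width, height)) tuples;
    this layout is my own ad-hoc schema, not a DeepStream one."""
    record = {
        "frame": frame_num,
        "objects": [
            {"label": label,
             "bbox": {"left": l, "top": t, "width": w, "height": h}}
            for (label, (l, t, w, h)) in objects
        ],
    }
    return json.dumps(record)

# One JSON object per frame, which I then append to a file line by line.
line = metadata_to_json(7, [("face", (100, 50, 160, 160))])
```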