Retail Object Detection - Training Help

Hi all,

I hope you are well. I am running a Seeed Studio reComputer J3011 (Jetson Orin Nano 8GB) flashed with JetPack 6.2. I have DeepStream 7.1 installed, and I am trying to test the retail_detector_100.onnx model on my RTSP stream.

It seems that my model is loaded, but it just flashes random bounding boxes around the border of my RTSP stream. My camera is quite far away from some of the products. Starting from a single shelf, what would be the best way to train my model on my products at this distance?

Would it be necessary to add cameras on each shelf or maybe move this current one closer?

Any advice would be appreciated.

Kind Regards,

PJ Pretorius.

For retail object detection, you can refer to this sample.

This sample integrates the retail_object_detection_binary_dino and retail_object_recognition models; you can find them on NGC.

The TAO models have some fine-tuning guidelines here.

Thanks, as you can see:

My RTSP stream has a big bounding box around the whole frame, so it’s not picking up individual objects. I see the engine was built with 416x416 max dims, so do you think these products are too small for the model to pick up?

Is there anyone I can chat with or call who might be able to give me better advice on how to train my model or position my cameras?

Thanks,

PJ.

I downloaded the MDX perception application. It seems like a good fit for what I am trying to do.

Is there a way that I can use this with RTSP? For example, give it a URI as input and have it output an RTSP stream as well?

I see the current source is:

source:
  csv-file-path: sources_retail_object.csv

Any help?

Thanks.

It may be that the products are too small, or it may be the accuracy of the model.

If you need fine-tuning, please use the TAO retail_object_detection_binary_dino / retail_object_recognition models mentioned above. We don’t know how to fine-tune your own model.

Yes, refer to /opt/nvidia/deepstream/deepstream/samples/configs/deepstream-app/source4_1080p_dec_infer-resnet_tracker_sgie_tiled_display_int8.yml and set the sink type value to 4 (RTSP streaming output).
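
As a rough sketch (check the group name and defaults against that sample file), an RTSP output sink in the deepstream-app YAML config looks roughly like the following; the port and bitrate values are only placeholders:

sink0:
  enable: 1
  type: 4            # 4 = RTSP streaming output
  codec: 1           # 1 = H.264, 2 = H.265
  enc-type: 0        # 0 = hardware encoder, 1 = software encoder
  bitrate: 4000000
  rtsp-port: 8554
  udp-port: 5400
  sync: 0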

An RTSP stream as input is also supported.
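
If it helps, the CSV referenced by csv-file-path mirrors the deepstream-app source group keys. Assuming the same columns as the sample CSVs shipped with deepstream-app (check sources_retail_object.csv for the exact header), an RTSP source row could look like this, with the camera URL as a placeholder:

enable,type,uri,num-sources,gpu-id,cudadec-memtype
1,4,rtsp://<camera-ip>:554/stream1,1,0,0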

Thanks!

I managed to get it up and running, but it seems to keep falling back to my SW encoder rather than the HW encoder.

I am averaging 3 FPS, and my RTSP stream is really laggy.

Would you have any suggestions on how I can increase my performance?

There is no hardware encoder on the Orin Nano.
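
Since there is no NVENC on the Orin Nano, the RTSP sink has to use the software encoder explicitly. A minimal, assumed tweak to the sink group (key names as in the deepstream-app configs, values only illustrative):

sink0:
  enc-type: 1        # 1 = software encoder; the Orin Nano has no NVENC
  codec: 1           # H.264 is cheaper to encode in software than H.265
  bitrate: 2000000   # a lower bitrate reduces the CPU cost of software encoding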

I see, thanks for letting me know.

How can I boost my performance? The thing is, I have not even trained this model on any of my products yet, and it’s already struggling to get more than 10 FPS.

What would I need to do/get in order to run it at a stable performance level?

Thanks,

PJ.

Set the device to MAXN mode, adjust the interval property of the nvinfer element, use an INT8-quantized model, or use other methods to optimize the model, and so on. You can ask model-optimization questions in the TAO forum.
If the performance still cannot match your requirements, you may need a more powerful device.
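
MAXN mode is normally enabled with the nvpmodel tool (the exact mode index depends on the board and JetPack version) together with jetson_clocks. For the inference side, the relevant keys sit in the nvinfer config file's property group; the values below are only examples to tune:

property:
  interval: 2        # skip 2 batches between inference calls; the tracker keeps boxes updated in between
  network-mode: 1    # 0 = FP32, 1 = INT8 (requires a calibration file), 2 = FP16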

Would it be better if, instead of using the live RTSP feed, I rather send myself JSON updates of detections through a dashboard?

This can only reduce the CPU usage for encoding; it cannot reduce the time spent on inference, because nvstreammux/nvinfer/nvvideoconvert use the GPU.
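
For reference, deepstream-app can publish detection metadata as JSON through a message-broker sink (type 6) instead of, or alongside, the RTSP sink. A minimal sketch, assuming a Kafka broker; the msgconv config path, connection string, and topic are placeholders:

sink1:
  enable: 1
  type: 6                        # 6 = message converter + message broker
  msg-conv-payload-type: 0       # 0 = full DeepStream schema
  msg-conv-config: msgconv_config.txt
  msg-broker-proto-lib: /opt/nvidia/deepstream/deepstream/lib/libnvds_kafka_proto.so
  msg-broker-conn-str: <broker-ip>;9092
  topic: retail-detections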

I am trying to use transfer learning to add some of my products to the retail_object_detection_binary_dino / retail_object_recognition models on NGC.

The problem is that NGC only offers a ‘.pth’ file as the trainable model. How would I be able to use TAO to train / transfer-learn the model if the only model available is a PyTorch checkpoint?

Thanks.

You can get more help in the TAO forums; I don’t know much about it.
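
For what it’s worth, the .pth checkpoint is what TAO itself consumes: the DINO fine-tuning flow points the experiment spec at the PyTorch weights and trains on your own (COCO-format) dataset. The key names below are only an assumed sketch and should be verified against the TAO DINO documentation for your TAO version:

train:
  pretrained_model_path: /path/to/retail_object_detection_binary_dino.pth   # assumed key; NGC weights as the starting point
  num_epochs: 20
dataset:
  train_data_sources:
    - image_dir: /data/my_products/images               # hypothetical dataset paths
      json_file: /data/my_products/annotations.json     # COCO-format labels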