ROS_deep_learning detectNet Node question

Hi,

I’m using a Jetson to design a ROS deep learning system. Currently, I created a ROS node that acquires images (via cv_bridge) and publishes them on a topic, then the detectNet node subscribes to that topic and processes the images to get bounding box information. My question is: why is the processing taking so long compared with the demo, where I got 30 FPS (Real-Time Object Detection in 10 Lines of Python Code on Jetson Nano - YouTube)? Is it something I did wrong with the video format (currently 320x480), or something else?
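For reference, here is roughly what my publisher node looks like (a simplified sketch; the topic name and camera index are just what I happen to use):

```cpp
#include <ros/ros.h>
#include <image_transport/image_transport.h>
#include <cv_bridge/cv_bridge.h>
#include <sensor_msgs/image_encodings.h>
#include <std_msgs/Header.h>
#include <opencv2/videoio.hpp>

int main(int argc, char** argv)
{
    ros::init(argc, argv, "camera_publisher");
    ros::NodeHandle nh;

    image_transport::ImageTransport it(nh);
    image_transport::Publisher pub = it.advertise("camera/image_raw", 1);

    cv::VideoCapture cap(0);  // open the first camera device
    if (!cap.isOpened())
    {
        ROS_ERROR("failed to open camera");
        return 1;
    }

    ros::Rate rate(30);
    while (ros::ok())
    {
        cv::Mat frame;
        if (cap.read(frame))
        {
            // wrap the OpenCV frame in a sensor_msgs/Image (bgr8) and publish it
            sensor_msgs::ImagePtr msg = cv_bridge::CvImage(
                std_msgs::Header(), sensor_msgs::image_encodings::BGR8, frame).toImageMsg();
            pub.publish(msg);
        }
        ros::spinOnce();
        rate.sleep();
    }
    return 0;
}
```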

Please advise, thank you so much

Hi @jiaxingg, what performance are you getting from ROS? In general, ROS will have higher overhead due to CPU transport of messages, whereas the detectnet-camera demo uses zero-copy between the camera and GPU processing.

Also, you might want to try running jetson_clocks to see if that helps.

Hi @dusty_nv

Thank you so much for your reply. And I want to say your demo is fantastic; I learned a lot from you!

Right now, what I get from ROS is “converting 640x480 bgr8 image” after I run the detectnet node, and only after several minutes do I get one detection result in the console.

I totally agree with you about the latency caused by the CPU transporting the messages. Is there any tutorial or demo I could learn from to use CUDA or the GPU to acquire and transport images to the ROS core instead of the CPU? Would that make the transport faster?

Thank you so much.

If you run it again, it should load much faster - the first time, it needs to optimize the detection model with TensorRT. After that, the TensorRT engine is cached to disk.

ROS 1 doesn’t support GPU transport between nodes. Alternatively, you could try creating nodelets for detectNet and your data source(s). I once made a nodelet for imageNet here: https://github.com/dusty-nv/ros_deep_learning/blob/master/src/nodelet_imagenet.cpp
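The basic structure of a nodelet looks something like this (just a rough sketch rather than the actual code from that file; the class and topic names here are placeholders):

```cpp
#include <nodelet/nodelet.h>
#include <pluginlib/class_list_macros.h>
#include <image_transport/image_transport.h>
#include <sensor_msgs/Image.h>
#include <memory>

namespace my_package
{

class DetectNetNodelet : public nodelet::Nodelet
{
private:
    std::shared_ptr<image_transport::ImageTransport> it_;
    image_transport::Subscriber sub_;

    virtual void onInit()
    {
        // getNodeHandle() returns the nodelet manager's shared handle;
        // images published by other nodelets in the same manager are passed
        // as pointers instead of being serialized over the loopback socket.
        ros::NodeHandle& nh = getNodeHandle();
        it_.reset(new image_transport::ImageTransport(nh));
        sub_ = it_->subscribe("camera/image_raw", 1,
                              &DetectNetNodelet::imageCallback, this);
    }

    void imageCallback(const sensor_msgs::ImageConstPtr& msg)
    {
        // run detectNet inference on msg here (omitted in this sketch)
        NODELET_INFO("received %ux%u image", msg->width, msg->height);
    }
};

} // namespace my_package

// register the nodelet so it can be loaded into a nodelet manager
PLUGINLIB_EXPORT_CLASS(my_package::DetectNetNodelet, nodelet::Nodelet)
```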

Also, let me ask you - is your performance lower when using your camera node than it is with the test publisher image_publisher?
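For example, you can publish a static test image like this and remap the topic to what your detectnet node subscribes to (the image path is just a placeholder):

```
rosrun image_publisher image_publisher /path/to/test_image.jpg
```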

Hi @dusty_nv,

I ran it again, and it wasn’t any faster than last time. As for the performance comparison with image_publisher, I think image_publisher is much faster than my method.

I’ll try sending video from usb_cam through image_publisher, and I’ll also try the nodelet method.

Another question: I want to train detectNet to detect USPS/UPS/FedEx trucks (the project is an AI home security camera), and I’ve downloaded some data to build the training and testing datasets. Do you have any suggestions for labeling software I can use?

Thank you again for your help!

Can you tell if the delay in loading is due to detectNet, or is it something else?

I haven’t personally tried, but you can find some recommendations here: https://www.quora.com/What-is-the-best-image-labeling-tool-for-object-detection

I think it’s due to sending the image messages through the CPU and a regular ROS node instead of a nodelet. I’m working on creating a ROS node based on OpenCV that communicates with the detectNet node via nodelets; hopefully it works.
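The plan is to load both into the same nodelet manager with a launch file, something like this (the package and nodelet names are placeholders for what I’m writing):

```xml
<launch>
  <!-- one nodelet manager process hosts both nodelets, so images pass as pointers -->
  <node pkg="nodelet" type="nodelet" name="detect_manager" args="manager" output="screen"/>

  <!-- placeholder camera nodelet (OpenCV capture) publishing camera/image_raw -->
  <node pkg="nodelet" type="nodelet" name="camera_nodelet"
        args="load my_package/CameraNodelet detect_manager"/>

  <!-- placeholder detectNet nodelet subscribing to the same topic -->
  <node pkg="nodelet" type="nodelet" name="detectnet_nodelet"
        args="load my_package/DetectNetNodelet detect_manager"/>
</launch>
```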

Thank you so much for the suggestion and help; I’ll go check it out.

Thank you!

Hi @dusty_nv,

I solved the delay issue by using the nodelet method like you said. I also switched to a CSI camera instead of a USB camera, which I think helps a little bit as well.
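In case it helps anyone else, this is the kind of GStreamer pipeline I pass to OpenCV for the CSI camera (the resolution and framerate are just what I happen to use):

```cpp
#include <opencv2/videoio.hpp>
#include <string>

// nvarguscamerasrc reads the CSI camera through the Jetson ISP;
// nvvidconv + videoconvert produce BGR frames that OpenCV can use directly
std::string pipeline =
    "nvarguscamerasrc ! "
    "video/x-raw(memory:NVMM), width=1280, height=720, framerate=30/1 ! "
    "nvvidconv ! video/x-raw, format=BGRx ! "
    "videoconvert ! video/x-raw, format=BGR ! appsink";

cv::VideoCapture cap(pipeline, cv::CAP_GSTREAMER);
```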

Now I have a question about training SSD-Mobilenet-v2 with my own dataset. I successfully created my own dataset; however, the instructions on the repo are about GoogleNet, so are there any docs I can follow to train SSD with DIGITS?

Thank you so much for your help!

Hi @jiaxingg, glad to hear you were able to solve the delay issue using nodelets.

Training SSD-Mobilenet-v2 isn’t done with DIGITS - see here for links about it: DIGITS or somthing else - #7 by dusty_nv