Increase FPS of Jetson-inference using complete utilisation of CPU-GPU

Pratosha · April 6, 2018, 9:23am

Hi,
We are running jetson-inference on an NVIDIA Jetson TX2 board using TensorRT.
We have modified the code to meet certain requirements, like lane overlays, CAN transmission, etc.
When there is no pedestrian to detect, the algorithm runs at 15 fps.
When it detects an object, it slows down to 3-4 fps.
Is there a way to compile the code, or make changes, so that functions can be split to run between the CPU and GPU?
Or is there any method to increase the FPS even when objects are detected?

Thanks,
Pratosha

dusty_nv · April 6, 2018, 2:43pm

Hi Pratosha, I haven’t really seen this behavior with detectNet. In my experience, it runs at the same performance regardless of how many objects are detected on-screen. How big are the objects? You may want to try disabling the rendering by editing detectnet-camera.cpp.

The clustering is performed on the CPU. Here is the code: jetson-inference/detectNet.cpp at e12e6e64365fed83e255800382e593bf7e1b1b1a (dusty-nv/jetson-inference on GitHub)

You may want to profile it or try disabling it to see if that is causing your performance issue.
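One low-effort way to profile a CPU stage such as the clustering is to wrap it in a wall-clock timer and average the result over frames. The following is a minimal sketch, not code from jetson-inference: `timeMs`, `StageStats`, and the `runClustering` placeholder in the comment are hypothetical names introduced for illustration.

```cpp
#include <chrono>
#include <utility>

// Runs fn() once and returns the elapsed wall-clock time in milliseconds.
template <typename Fn>
double timeMs(Fn&& fn)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Accumulates per-frame timings for one stage and reports the mean.
struct StageStats
{
    double totalMs = 0.0;
    int    frames  = 0;

    void   add(double ms) { totalMs += ms; ++frames; }
    double meanMs() const { return frames > 0 ? totalMs / frames : 0.0; }
};

// Possible use inside a frame loop (runClustering is a hypothetical
// placeholder for whichever CPU stage is being measured):
//
//   StageStats clusterStats;
//   clusterStats.add(timeMs([&] { runClustering(); }));
//   printf("clustering: %.2f ms avg\n", clusterStats.meanMs());  // needs <cstdio>
```

Comparing the mean for frames with and without detections should show whether the slowdown comes from this stage or from somewhere else, such as the rendering.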

Pratosha · April 9, 2018, 4:21am

Hi dusty_nv,

On commenting out lines 383 to 410 in detectNet.cpp, the frame rate stays constant at 15 fps, but no detections or overlays are shown.
On changing the profile mode from TRUE to FALSE in tensorNet.cpp and rebuilding, the fps dropped to 11-12 with no objects in the frame.
The detected objects are person-sized. The size depends on where the person is standing (there can be one person or several).

We have added a lot of lines in cudaOverlay.cu for drawing lane lines and boundaries. That is the reason for the reduction in fps.

Please help.

Thanks,
Pratosha

dusty_nv · April 10, 2018, 12:36pm

Hi Pratosha, if you plan to have a display attached, you may wish to investigate having OpenGL do the rendering of the lines, etc., since you require additional visualization and that is impacting performance. The reason for cudaOverlay.cu is so that it can still do some basic visualization even with no display attached (i.e. in a headless robot). The rasterization implemented in cudaOverlay is not optimized and is meant for simplicity.

Pratosha · April 11, 2018, 10:10am

Hi Dusty,

Thanks a lot for your response.
“you may wish to investigate having OpenGL do the rendering of the lines/etc.”
If cudaOverlay.cu is meant for simplicity, can you suggest where else the code for overlays/lines can be written? Or, which cpp file do we have to change to get the lines rendered?

Another observation we made: running the jetson-inference detectnet-camera (without any changes) gives 10 fps when an object is present. When no object is detected on the screen, it's 15 fps.
How can we increase the fps on the Jetson TX2 for jetson-inference?
I have attached a screenshot for your reference (Imgur link).

Kindly help

Thanks,
Pratosha

dusty_nv · April 11, 2018, 2:55pm

You would want to add additional OpenGL rendering code after this line: https://github.com/dusty-nv/jetson-inference/blob/e12e6e64365fed83e255800382e593bf7e1b1b1a/detectnet-camera/detectnet-camera.cpp#L259

This is where the OpenGL texture is rendered after the CUDA<->OpenGL interoperability is complete. After this point in the code, you would want to render your more complex overlay.
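As a rough illustration of the overlay step, the sketch below converts lane-line endpoints from camera-image pixels into window pixels. It produces a flat vertex array that legacy OpenGL could draw with `glVertexPointer` and `glDrawArrays(GL_LINES, ...)`. It assumes the display uses a pixel-space orthographic projection with the origin at the top left, and that the camera texture is drawn at a known offset and size. `LineSeg`, `toWindowVertices`, and all parameter names are hypothetical and not part of jetson-inference.

```cpp
#include <vector>

struct Point2f { float x, y; };
struct LineSeg { Point2f a, b; };   // one lane line, in camera-image pixels

// Maps line endpoints from image pixels to window pixels, given the
// rectangle (dstX, dstY, dstW, dstH) where the camera texture is drawn.
// Returns a flat array laid out as x0, y0, x1, y1 per line segment.
std::vector<float> toWindowVertices(const std::vector<LineSeg>& lines,
                                    float imgW, float imgH,
                                    float dstX, float dstY,
                                    float dstW, float dstH)
{
    const float sx = dstW / imgW;   // horizontal scale, image -> window
    const float sy = dstH / imgH;   // vertical scale, image -> window

    std::vector<float> verts;
    verts.reserve(lines.size() * 4);

    for (const LineSeg& l : lines)
    {
        verts.push_back(dstX + l.a.x * sx);
        verts.push_back(dstY + l.a.y * sy);
        verts.push_back(dstX + l.b.x * sx);
        verts.push_back(dstY + l.b.y * sy);
    }
    return verts;
}

// With a current legacy OpenGL context, the array could then be drawn
// after the texture is rendered, roughly like this:
//
//   glEnableClientState(GL_VERTEX_ARRAY);
//   glVertexPointer(2, GL_FLOAT, 0, verts.data());
//   glDrawArrays(GL_LINES, 0, (GLsizei)(verts.size() / 2));
//   glDisableClientState(GL_VERTEX_ARRAY);
```

Drawing the lines this way leaves rasterization to the GPU's graphics pipeline instead of the custom kernels in cudaOverlay.cu.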

Have you tried running “sudo ~/jetson_clocks.sh” or “sudo nvpmodel -m 0”?

Pratosha · April 12, 2018, 5:12am

Hi dusty,

Yes, your suggestion worked.
We are adding our lines there. Thank you :)

“Have you tried running “sudo ~/jetson_clocks.sh” or “sudo nvpmodel -m 0”?”
Yes, we run both commands in the terminal before running detectnet-camera.

Thanks.
