Slow video streaming while using PyTorch with CUDA

Hello,
I'm using a Jetson Xavier and have created a Python multiprocessing application for video analytics, containing the following two processes (entirely separate, with no inter-process communication):

  1. A simple process that captures a video source with OpenCV and displays it with OpenCV's imshow.
  2. A process that takes a constant tensor and runs it through a PyTorch NN model with CUDA in an infinite loop.
    (I started with the two processes connected, capturing the video and then processing the frames concurrently, but to debug the problem I broke the connection and ran the model on a constant tensor.)

The problem: when only the video process is running, the video is smooth. But when the NN process runs concurrently, the video becomes choppy, even though the video process still reports 25 fps, as I designed it with waitKey.
I suspect that the NN is using all of my GPU resources, so the frame rendering is put on hold for a few milliseconds until the GPU is free to render. If that is indeed the cause of my problem, can I set a priority for GPU usage between the two processes? Or do you have another idea for how to solve it?

This is the pseudocode (the original is offline and I can't upload it):
Process1:

import cv2
import time

video = cv2.VideoCapture(video_source)  # video_source as in the original: camera index or path
prev = time.time()
while True:
    ret, frame = video.read()
    cv2.imshow("video", frame)
    cv2.waitKey(40)  # ~25 fps
    now = time.time()  # check fps
    print("fps:", 1.0 / (now - prev))
    prev = now

Process2:

import torch

model = ...  # loading the PyTorch NN model, moved to the GPU with model.cuda().eval()
tensor = torch.zeros(input_shape, device="cuda")  # zeros CUDA tensor matching the model input
while True:
    with torch.no_grad():
        model(tensor)

Attaching tegrastats output captured while both processes were running.

Thanks
moti

Hi,
Please flash the Xavier with the MAXN config and try again:
https://devtalk.nvidia.com/default/topic/1049168/
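If reflashing is inconvenient, the MAXN power mode can usually also be selected at runtime with the standard JetPack tools (adjust the mode number if your configuration differs):

sudo nvpmodel -m 0   # mode 0 is MAXN on Xavier
sudo jetson_clocks   # lock the clocks at their maximum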

For checking system load, you can use tegrastats.
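For example, leave it running in a second terminal while both processes are active (the interval is in milliseconds):

sudo tegrastats --interval 1000   # GR3D_FREQ is the GPU utilization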

Hi,
I tried it and saw no difference.
Do you have any other suggestions?

Thanks

Hi,

Your GPU utilization is 99% (GR3D_FREQ).

When running NN inference on the GPU, it takes almost all of the GPU resources and limits the streaming performance.
Moreover, it looks like you keep pushing NN jobs into the GPU (the while loop), which keeps the resource occupied all the time.
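As a quick experiment, you can also pace the inference loop yourself so the GPU gets idle gaps in which the display can render. A minimal sketch, using torchvision's resnet18 as a stand-in for your model (which we don't have):

import time
import torch
import torchvision

# resnet18 is only a stand-in; the real model is offline
model = torchvision.models.resnet18().cuda().eval()
tensor = torch.zeros(1, 3, 224, 224, device="cuda")

while True:
    with torch.no_grad():
        model(tensor)
    # Wait for the queued GPU work to finish instead of piling up more
    # launches, then sleep briefly so other GPU clients (the display) can run.
    torch.cuda.synchronize()
    time.sleep(0.01)

Sleeping trades inference throughput for display smoothness; tune the interval to your needs.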

That said, it is a little bit tricky to handle different GPU tasks at the same time.
Here are two suggestions for you:

1. Try our DeepStream SDK.
The pipeline from capture → inference → display is optimized, and CUDA tasks are only triggered when needed.
This will help with the resource-occupation problem in your use case.

2. Try running your DNN on the DLA, which offloads the GPU so it stays free for display or video usage (see the sketch below).
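For reference, a rough sketch of building a TensorRT engine that targets the DLA from an ONNX export of your model. model.onnx is a hypothetical path, and this assumes the TensorRT 8 Python API; note that the DLA requires FP16 or INT8 precision:

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)
with open("model.onnx", "rb") as f:  # hypothetical ONNX export of your model
    parser.parse(f.read())

config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)          # DLA needs FP16 or INT8
config.set_flag(trt.BuilderFlag.GPU_FALLBACK)  # GPU handles unsupported layers
config.default_device_type = trt.DeviceType.DLA
config.DLA_core = 0                            # Xavier has DLA cores 0 and 1

with open("model_dla.engine", "wb") as f:
    f.write(builder.build_serialized_network(network, config))

Layers the DLA cannot run fall back to the GPU, so the offload is only partial for unsupported architectures.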

Thanks.