Having CUDA Kernels Share Time Slices


I’m using two CUDA kernels: one acts as an image filter, and the other occasionally provides data to the image filter. I have noticed that the image filter stops until the data kernel completes, which makes playback stall until the data kernel is done. Is there any way to have them share time slices, so that each runs for a short period and then lets the other run? Thanks.



Would you mind describing your implementation in more detail?
Are these filters in the same application, and is each one called from its own thread?

Please note that jobs submitted to the same stream are executed in order.

You can check the document above for some information first.
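To illustrate the in-order point: two kernels can only overlap or interleave if they are submitted to different streams. Below is a minimal sketch, assuming both kernels are launched from the same process (threads in one process normally share a CUDA context). The names FilterStreams and create_filter_streams are hypothetical and are not part of nvivafilter or any NVIDIA library.

```cuda
#include <cuda_runtime.h>

// Hypothetical holder for the two streams; the names are illustrative only.
struct FilterStreams {
    cudaStream_t video;  // higher priority: per-frame image filter kernel
    cudaStream_t lut;    // lower priority: occasional 3D table generation
};

// Creates two non-blocking streams with different priorities.
// Returns true on success.
bool create_filter_streams(FilterStreams* s) {
    int least = 0, greatest = 0;
    // Numerically lower values mean higher priority.
    if (cudaDeviceGetStreamPriorityRange(&least, &greatest) != cudaSuccess)
        return false;

    // cudaStreamNonBlocking avoids implicit synchronization with the
    // legacy default stream.
    if (cudaStreamCreateWithPriority(&s->video, cudaStreamNonBlocking,
                                     greatest) != cudaSuccess)
        return false;

    if (cudaStreamCreateWithPriority(&s->lut, cudaStreamNonBlocking,
                                     least) != cudaSuccess) {
        cudaStreamDestroy(s->video);
        return false;
    }
    return true;
}
```

Each kernel would then be launched with its stream as the fourth launch parameter, for example <<<grid, block, 0, s.video>>> for the image filter. Stream priorities are only a scheduling hint, and they apply within one process; if the two kernels live in separate processes, this sketch does not help.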

I have two CUDA kernels. One is used as a filter that processes the video frames with a 3D table, using gstreamer and nvivafilter with our custom library. The other is called by our player app when a new 3D table needs to be created. As you can tell, they run in separate threads. I was just trying to reduce the delay in processing video frames while the new 3D LUT is being created. I thought I could have the CUDA kernel that creates the 3D table yield after X percent of the table has been created, to allow the CUDA kernel used for the video filter to process some video frames.
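A running kernel cannot yield on request, but the "X percent at a time" idea can be approximated by splitting the table build into several short launches, so the scheduler gets a chance to run other work between them. The following is only a sketch under assumptions: build_lut_chunk and build_lut_chunked are hypothetical names, the identity mapping stands in for the real table computation, and whether frames actually get interleaved depends on the GPU, the driver, and whether both kernels run in the same process.

```cuda
#include <cuda_runtime.h>

// Placeholder kernel: fills entries [begin, end) of an n*n*n table.
// The identity mapping stands in for the real 3D table computation.
__global__ void build_lut_chunk(float3* lut, int n, int begin, int end) {
    int idx = begin + blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= end) return;
    int r = idx % n;
    int g = (idx / n) % n;
    int b = idx / (n * n);
    float scale = 1.0f / (n - 1);
    lut[idx] = make_float3(r * scale, g * scale, b * scale);
}

// Builds the table in `chunks` separate launches on `lut_stream`, then
// waits for that stream only. Other streams are not blocked.
cudaError_t build_lut_chunked(float3* d_lut, int n, int chunks,
                              cudaStream_t lut_stream) {
    if (d_lut == nullptr || n < 2 || chunks < 1) return cudaErrorInvalidValue;

    const int total = n * n * n;
    const int per_chunk = (total + chunks - 1) / chunks;
    const int threads = 256;

    for (int begin = 0; begin < total; begin += per_chunk) {
        int end = (begin + per_chunk < total) ? begin + per_chunk : total;
        int blocks = (end - begin + threads - 1) / threads;
        build_lut_chunk<<<blocks, threads, 0, lut_stream>>>(d_lut, n,
                                                            begin, end);
        cudaError_t err = cudaGetLastError();
        if (err != cudaSuccess) return err;
    }
    return cudaStreamSynchronize(lut_stream);
}
```

With chunks set to 10, each launch covers roughly 10 percent of the table. Smaller chunks give more scheduling opportunities at the cost of more launch overhead, so the value would need tuning against measured frame latency.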


Sorry for the late update.

Would you mind tracking the GPU utilization of each process?
Please note that GPU resources are limited on the Jetson platform.

If one app fully occupies the GPU, you won’t be able to share the resource with another application.