Why can't I launch the Nv12ToBgra32 kernel function on different streams in parallel?

I created 2 streams to load and display 2 videos, using the AppDecGL sample.

Analysis in Nsight shows that the two streams do not run in parallel.

What should I do?


What does the term "stream" refer to here: a CUDA stream (CUstream object) or a video stream?

To process two video streams in parallel, you should create two CUDA stream objects, one per video stream. Pass the first CUDA stream to cuvidMapVideoFrame() and to the Nv12ToBgra32 kernel launch for the first video stream, and pass the second CUDA stream to the same calls for the second video stream. Without separate CUDA streams, the kernels run on the null (default) CUDA stream, which serializes their execution.
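As a rough illustration of the launch side of this suggestion, here is a minimal sketch using the CUDA runtime API. It is not the AppDecGL code: fakeNv12ToBgra and runOnTwoStreams are hypothetical names, and the kernel is only a stand-in that writes grey pixels instead of doing a real NV12 conversion. The part that matters is the fourth launch parameter, which selects the stream. On the decode side, as far as I know, the stream is handed to cuvidMapVideoFrame() through the output_stream field of CUVIDPROCPARAMS, but please check that against your SDK headers.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real NV12 -> BGRA kernel: it copies the
// luma value into B, G and R and sets alpha to 255.
__global__ void fakeNv12ToBgra(const uint8_t *luma, uchar4 *bgra,
                               int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    uint8_t v = luma[y * width + x];
    bgra[y * width + x] = make_uchar4(v, v, v, 255);
}

// Converts one dummy frame per video, each on its own CUDA stream.
// Returns true if every CUDA call succeeded and the output looks right.
bool runOnTwoStreams(int width, int height)
{
    const int kVideos = 2;
    cudaStream_t streams[kVideos] = {nullptr, nullptr};
    uint8_t *src[kVideos] = {nullptr, nullptr};
    uchar4 *dst[kVideos] = {nullptr, nullptr};
    const size_t pixels = static_cast<size_t>(width) * height;
    bool ok = true;

    // One non-default stream and one pair of buffers per video.
    for (int i = 0; i < kVideos; ++i) {
        ok = ok && cudaStreamCreateWithFlags(&streams[i],
                       cudaStreamNonBlocking) == cudaSuccess;
        ok = ok && cudaMalloc(&src[i], pixels) == cudaSuccess;
        ok = ok && cudaMalloc(&dst[i], pixels * sizeof(uchar4)) == cudaSuccess;
    }

    const dim3 block(32, 8);
    const dim3 grid((width + block.x - 1) / block.x,
                    (height + block.y - 1) / block.y);

    // Queue all the work first, without synchronizing in between, so the
    // two streams have a chance to overlap.
    for (int i = 0; ok && i < kVideos; ++i) {
        ok = ok && cudaMemsetAsync(src[i], 128, pixels, streams[i]) == cudaSuccess;
        // The fourth launch parameter is the stream. Leaving it out would
        // put the kernel on the default stream.
        fakeNv12ToBgra<<<grid, block, 0, streams[i]>>>(src[i], dst[i],
                                                       width, height);
        ok = ok && cudaGetLastError() == cudaSuccess;
    }

    // Wait for each stream, then read back the first pixel as a sanity check.
    for (int i = 0; ok && i < kVideos; ++i) {
        ok = ok && cudaStreamSynchronize(streams[i]) == cudaSuccess;
        uchar4 px = make_uchar4(0, 0, 0, 0);
        ok = ok && cudaMemcpy(&px, dst[i], sizeof(px),
                              cudaMemcpyDeviceToHost) == cudaSuccess;
        ok = ok && px.x == 128 && px.y == 128 && px.z == 128 && px.w == 255;
    }

    for (int i = 0; i < kVideos; ++i) {
        cudaFree(src[i]);
        cudaFree(dst[i]);
        if (streams[i]) cudaStreamDestroy(streams[i]);
    }
    return ok;
}
```

You can call runOnTwoStreams(1920, 1080) from main() and look at the timeline in Nsight. Note that separate streams only allow the kernels to overlap. If a single launch already occupies the whole GPU, the two kernels may still appear back to back in the timeline.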

Let us know if this suggestion helps you.

Ryan Park