Does vpiStreamSync on a wrapped CUDA stream synchronize the underlying CUDA stream?

makkarpov · January 8, 2022, 10:04pm

Consider the following usage pattern:

1. Create a CUDA stream, then create a VPI stream wrapping that CUDA stream.
2. Allocate CUDA memory (e.g. via OpenCV’s cv::cuda::GpuMat), then create a VPI image wrapping that memory.
3. Use the newly wrapped VPI image as the output of some VPI function (that is, fill that image with data).
4. Do some processing on the raw CUDA memory (e.g. memcpy to host, do something, memcpy back). Synchronize the CUDA stream.
5. Use the same wrapped VPI image from (2) as an input to some VPI function.

Currently I’m facing various segfaults when that pattern runs in a multithreaded environment. However, if I do a VPI synchronization at (4), everything seems to work more or less correctly (at least it does not segfault).

So, the question is: will VPI synchronization on a wrapped stream also synchronize the underlying CUDA stream?

Related question: will indirect VPI synchronization via vpiSubmitHostFunctionEx have the same semantics as if the CUDA stream were also synchronized by that call?

AastaLLL · January 10, 2022, 3:55am

Hi,

Yes, it will.
Do you see any errors after using vpiStreamSync?

Thanks.

makkarpov · January 10, 2022, 9:22am

No, but I want to know whether it’s just a coincidence or whether syncing the VPI stream explicitly synchronizes the CUDA stream.

The same applies to vpiSubmitHostFunctionEx, as I use it to do stream synchronization in a program that is built around boost::fibers.

AastaLLL · January 20, 2022, 6:04am

Hi,

You can find a document for vpiStreamSync below:
https://docs.nvidia.com/vpi/group__VPI__Stream.html#ga31f569f9da89eabc0249d42746f1c3b7

Basically, it will synchronize all the tasks attached to the stream.
So it will also flush the CUDA jobs that have been attached to the stream.

Thanks.
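
One way to probe this behaviour empirically is sketched below. It is only a sketch under assumptions: it assumes VPI 2.x headers, where the wrapper call is named vpiStreamCreateWrapperCUDA (older releases may use a different name), and the buffer size and the CHECK_CUDA / CHECK_VPI macros are illustrative, not taken from this thread. It queues asynchronous work directly on the CUDA stream, calls vpiStreamSync on the wrapping VPI stream, and then asks CUDA whether the stream is idle.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

#include <cuda_runtime.h>
#include <vpi/CUDAInterop.h>
#include <vpi/Status.h>
#include <vpi/Stream.h>

// Illustrative helpers (not part of VPI or CUDA): abort on any failed call.
#define CHECK_CUDA(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s\n",                      \
                         cudaGetErrorString(err_));                       \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

#define CHECK_VPI(call)                                                   \
    do {                                                                  \
        VPIStatus st_ = (call);                                           \
        if (st_ != VPI_SUCCESS) {                                         \
            std::fprintf(stderr, "VPI error: %s\n",                       \
                         vpiStatusGetName(st_));                          \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

int main() {
    // Step 1 of the pattern: a CUDA stream and a VPI stream wrapping it.
    cudaStream_t cudaStream = nullptr;
    CHECK_CUDA(cudaStreamCreate(&cudaStream));

    VPIStream vpiStream = nullptr;
    CHECK_VPI(vpiStreamCreateWrapperCUDA(cudaStream, 0, &vpiStream));

    // Queue some asynchronous work directly on the raw CUDA stream.
    // The size is arbitrary; it only needs to keep the GPU busy for a while.
    const std::size_t bytes = static_cast<std::size_t>(256) << 20;
    void* buffer = nullptr;
    CHECK_CUDA(cudaMalloc(&buffer, bytes));
    CHECK_CUDA(cudaMemsetAsync(buffer, 0, bytes, cudaStream));

    // Synchronize through VPI only, without touching the CUDA stream.
    CHECK_VPI(vpiStreamSync(vpiStream));

    // cudaStreamQuery does not block: it reports cudaSuccess when the stream
    // is idle and cudaErrorNotReady when work is still pending.
    cudaError_t state = cudaStreamQuery(cudaStream);
    if (state == cudaSuccess) {
        std::printf("CUDA stream is idle after vpiStreamSync\n");
    } else if (state == cudaErrorNotReady) {
        std::printf("CUDA stream still has pending work after vpiStreamSync\n");
    } else {
        std::printf("cudaStreamQuery failed: %s\n", cudaGetErrorString(state));
    }

    // Clean up: drain the stream before releasing anything.
    CHECK_CUDA(cudaStreamSynchronize(cudaStream));
    vpiStreamDestroy(vpiStream);
    CHECK_CUDA(cudaFree(buffer));
    CHECK_CUDA(cudaStreamDestroy(cudaStream));
    return 0;
}
```

Note that the two outcomes are not equally informative. A "still has pending work" result would show that vpiStreamSync returned before the CUDA work finished, while an "idle" result is only consistent with synchronization, since the memset may simply have completed on its own in the meantime.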
