Does vpiStreamSync on a wrapped CUDA stream synchronize the underlying CUDA stream?

Consider the following usage pattern:

  1. Create a CUDA stream, then create a VPI stream wrapping that CUDA stream.
  2. Allocate CUDA memory (e.g. via OpenCV’s cv::cuda::GpuMat), then create a VPI image wrapping that memory.
  3. Use the newly wrapped VPI image as the output of some VPI function (that is, have VPI fill the data in that image).
  4. Do some processing on the raw CUDA memory (e.g. memcpy to host, do something, memcpy back), then synchronize the CUDA stream.
  5. Use the same wrapped VPI image from (2) as an input to some VPI function.

Currently I’m seeing various segfaults when this pattern runs in a multithreaded environment. However, if I do a VPI synchronization at step (4), everything seems to work more or less correctly (at least it does not segfault).

So, the question is: does VPI synchronization on a wrapped stream also synchronize the underlying CUDA stream?

Related question: does indirect VPI synchronization via vpiSubmitHostFunctionEx have the same semantics, as if the CUDA stream were also synchronized by that call?


Yes, it will.
Do you see any errors after using vpiStreamSync?


No, but I want to know whether it’s just a coincidence or whether syncing the VPI stream explicitly synchronizes the CUDA stream as well.

The same question applies to vpiSubmitHostFunctionEx, since I use it for stream synchronization in a program that is built around boost::fibers.


You can find the documentation for vpiStreamSync below:

Basically, it will synchronize all the tasks attached to the stream.
So it will also flush the CUDA jobs that have been attached to the stream.
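
To illustrate why a host function submitted to a stream can act as a sync point, here is a minimal toy model, assuming the stream runs its tasks strictly in submission order. It does not call VPI or CUDA at all: ToyStream, submitMarker and runDemo are hypothetical names invented for this sketch, and a single worker thread stands in for the stream.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Toy stand-in for a stream: one worker thread runs tasks in FIFO order.
// This is NOT the VPI API; every name here is hypothetical.
class ToyStream {
public:
    ToyStream() : worker_([this] { run(); }) {}

    ~ToyStream() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_one();
        worker_.join();  // worker drains the queue, then exits
    }

    // Enqueue a task; it runs after everything submitted before it.
    void submit(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }

    // Enqueue a "host function" whose only job is to signal completion.
    // Because tasks run in order, the future becomes ready only after
    // all previously submitted tasks have finished.
    std::future<void> submitMarker() {
        auto done = std::make_shared<std::promise<void>>();
        std::future<void> ready = done->get_future();
        submit([done] { done->set_value(); });
        return ready;
    }

private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
        }
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist first
};

// Submit three tasks, then wait on a marker instead of a blocking sync call.
std::vector<int> runDemo() {
    std::vector<int> log;
    ToyStream stream;
    for (int i = 0; i < 3; ++i) {
        stream.submit([&log, i] { log.push_back(i); });
    }
    stream.submitMarker().wait();  // earlier tasks are done once this returns
    return log;
}
```

In a fiber-based program the wait on the future would be replaced by a fiber-friendly wait, but the ordering argument is the same: the marker cannot run until the work queued before it has completed. Whether a real VPI host function gives the same guarantee for work queued directly on the wrapped CUDA stream depends on VPI itself, which this sketch does not exercise.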

