[WARN ] 2023-12-22 02:19:36 VPI_ERROR_INVALID_OPERATION: PVA is not available and may be oversubscribed in the system: PvaError_DeviceUnavailable

I am trying to use VPI with multiprocessing, specifically in Python rather than C++ (JetPack 5.1 with VPI 2.3.9). I created a forum post about this earlier (Unable to use multiprocessing or GNU Parallel with VPI - #9), but the solution suggested there was for C++ code and does not carry over to Python: vpi.Image and vpi.asimage don’t take a backend as input, and in Python the backend is instead selected via a context (with). That is what I was already doing, since my code is a modified version of the samples.

One of the Stack Overflow questions (CUDA ERROR: initialization error when using parallel in python - Stack Overflow) described a similar issue, and one of the suggested answers, changing the start method from fork to spawn, was helpful. With this change I have been able to use multiprocessing, but only with a pool size of 3 and only using map or map_async:

multiprocessing.set_start_method('spawn')
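
For reference, the setup described above can be sketched roughly as follows. This is a minimal sketch, not the actual script: `process_pair`, `capped_workers`, `run_all` and `MAX_WORKERS` are hypothetical names, and the worker body is a placeholder where the VPI stereo code would go. It uses `multiprocessing.get_context("spawn")`, which selects the spawn start method for that pool only.

```python
import multiprocessing as mp

MAX_WORKERS = 3  # largest pool size reported to work in this post


def capped_workers(requested, limit=MAX_WORKERS):
    """Clamp the requested pool size to the range [1, limit]."""
    return max(1, min(requested, limit))


def process_pair(pair):
    """Hypothetical worker: the VPI stereo code would run here."""
    left_path, right_path = pair
    return f"{left_path}|{right_path}"  # placeholder result, not a disparity map


def run_all(pairs, requested_workers):
    """Map process_pair over all pairs with a spawn-based, size-capped pool."""
    ctx = mp.get_context("spawn")  # spawn start method for this pool only
    with ctx.Pool(processes=capped_workers(requested_workers)) as pool:
        return pool.map(process_pair, pairs)
```

`run_all` has to be called from under an `if __name__ == "__main__":` guard, because spawn re-imports the main module in every worker process.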

But if I set the pool size to 4 or more, I get the following error (it is logged as a warning, but the work is not processed):

[WARN ] 2023-12-22 02:19:28 VPI_ERROR_INVALID_OPERATION: PVA is not available and may be oversubscribed in the system: PvaError_DeviceUnavailable             

I am not using PVA as the backend; I am using the CUDA backend, as follows:

streamStereo = vpi.Stream()
with streamStereo, vpi.Backend.CUDA:
    # load left image using opencv
    # left_vpi_img = vpi.asimage(opencv_img).convert(vpi.Format.U8, scale=1)
    # similarly load right image and move it to GPU as above
    # estimate stereo disparity using vpi.stereodisp

Using jtop I have monitored GPU memory usage, which shows that I can load more images and improve the performance of this process. Please let me know if you need any additional information or logs

Hi,

At most 4 independent processes can subscribe to PVA.
If you are not using PVA, please create the VPI stream with only the backend you use.

Thanks.

Hi @AastaLLL ,

I already create a VPI stream with the CUDA backend, as follows:

streamStereo = vpi.Stream()
backend = vpi.Backend.CUDA

with streamStereo, backend:
    # ... load image ... logic as mentioned above

and I still get the error mentioned in the title.

Hi @AastaLLL , could you please share any update on this?

Hi,

Sorry for the late update.

In your source, the VPI stream is still created with the default setting (vpi.Stream()).
Please create it with only the backend you need instead.

For example:
https://docs.nvidia.com/vpi/group__VPI__Stream.html#gad0c55a589ed8f70cdb7700929610a040

// Create the stream for the CUDA backend only.
CHECK_STATUS(vpiStreamCreate(VPI_BACKEND_CUDA, &stream));

Unfortunately, this configuration is not available for the Python interface.
Does the C++ solution work for you?
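
To make the difference between a default stream and a CUDA-only stream concrete, here is a small Python model of backend flags as a bitmask. It is only an illustration: the `Backend` enum below is a hypothetical stand-in, not the real `vpi.Backend`, its bit values are made up, and it assumes (as this reply suggests) that a default stream behaves as if every backend were enabled.

```python
from enum import Flag, auto


class Backend(Flag):
    """Hypothetical stand-in for backend bit flags; not the real vpi.Backend."""
    CPU = auto()
    CUDA = auto()
    PVA = auto()
    VIC = auto()


# Assumption: a stream created with default settings enables every backend.
ALL_BACKENDS = Backend.CPU | Backend.CUDA | Backend.PVA | Backend.VIC


def touches_pva(stream_backends):
    """Return True if a stream with these flags would need the PVA backend."""
    return bool(stream_backends & Backend.PVA)
```

Under that assumption, `touches_pva(ALL_BACKENDS)` is true while `touches_pva(Backend.CUDA)` is false, which is the distinction the C++ flag argument makes.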

Thanks.

Hi @AastaLLL ,

It would be a difficult change, so before I proceed with rewriting my whole implementation in C++, I would like to know whether VPI in C++ can work with C++ threading. This will take a lot of effort, so I want to be sure that if I switch to C++ I can have multiple threads working with VPI (the input file is in a custom format, so I have to write my own parser in C++).

Also, it would be of great help if this issue could be fixed in VPI 3.

Hi,

I suppose so, since the Python API is a wrapper around the C++ library.
The compatibility should be the same.

Another alternative is to add the support on your own.
The VPI Python binding is open source as of v3.0, so you should be able to enable the flag option yourself when creating the VPI image.

Thanks.

Hi @AastaLLL , I am starting to port my code from Python to C++ but I am facing an issue.

The documentation pages have been updated to VPI 3.0.0, for example: VPI - Vision Programming Interface: Stereo Disparity Estimator

How can I access documentation for VPI 2.3.9?

Hi,

Please check the below link:

https://docs.nvidia.com/vpi/2.3/

Thanks.


Hi,

Thanks for your patience.

We have reached some conclusions about the Python interface.
The current Python API can only support up to 4 VPI processes.
Since the global context is created at the beginning of the VPI module import, the C++ workaround cannot be applied even after the binding is added.

This is a limitation by design.
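
One thing this might imply, if every process that imports the module counts toward the limit, is that keeping the import out of the parent process leaves more of the 4 slots for workers. This is an untested assumption and nothing in this thread confirms it. The sketch below only shows the mechanics: `init_worker`, `get_module`, `make_pool` and `MAX_VPI_PROCESSES` are hypothetical names, and the module name is a parameter so the pattern can be tried with any module.

```python
import importlib
import multiprocessing as mp

MAX_VPI_PROCESSES = 4  # limit quoted in this thread
_module = None  # set once per worker process by init_worker


def init_worker(module_name="vpi"):
    """Import the module inside the worker, so the parent never imports it."""
    global _module
    _module = importlib.import_module(module_name)


def get_module():
    """Return the module imported by init_worker (None before it has run)."""
    return _module


def make_pool(workers, module_name="vpi"):
    """Create a spawn-based pool whose workers each import the module once."""
    if not 1 <= workers <= MAX_VPI_PROCESSES:
        raise ValueError(f"workers must be between 1 and {MAX_VPI_PROCESSES}")
    ctx = mp.get_context("spawn")
    return ctx.Pool(processes=workers, initializer=init_worker,
                    initargs=(module_name,))
```

For this to have any effect, the main script must not import vpi at module level, otherwise the parent process imports it as well. Worker functions would then use `get_module()` instead of a top-level import.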
Thanks.