Hello, I am trying to warp two 1080p images on the GPU, then pass them to another non-GPU function.

Currently the VPI .cpu() function is very slow and does not give me access to the results (so that I can inspect the array) in realtime. It takes about 50ms for 2 images, which is slower than OpenCV CUDA.

Can I have some advice on how to speed this up so I can process images in realtime?

Thank you

cv_backend = vpi.Backend.CUDA

img = cap.read()
img2 = cap2.read()

with cv_backend:
    frame1 = vpi.asimage(img[0:1080, 0:1920,:])
    frame2 = vpi.asimage(img2[0:1080, 0:1920,:])

with cv_backend:
    output1 = frame1.perspwarp(hom)
    output2 = frame2.perspwarp(hom)

with output1.rlock(), output2.rlock():
    plop1=output1.cpu()
    plop2=output2.cpu()


Have you maximized the device performance already?
You can find the script to boost VPI performance below:


Hello, I just tried that and it didn’t affect the processing time. I was incorrect: it's actually 70ms for the output-to-CPU stage, not 50ms as I originally stated.

I have the same problem. With the CUDA backend, calling VPI.Image.cpu() takes ~50ms. Any ideas on how to improve it?


Is the 70ms spent only on downloading the output to the CPU?
Or does it also include the time for the perspective warp?
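
One way to check this is to time each stage separately. VPI submits algorithms to a stream asynchronously, so a timer placed only around the .cpu() calls may also be charged for warp work that has not finished yet. The sketch below is a generic timing helper, not VPI code: `time_stages`, `fake_warp` and `fake_download` are hypothetical names, and the two placeholder functions only sleep so that the example runs without a GPU.

```python
import time


def time_stages(stages, repeats=10):
    """Return the mean wall-clock time in milliseconds for each named stage.

    `stages` is a list of (name, callable) pairs that are run in order.
    Each callable must block until its own work has finished, otherwise
    its cost is silently charged to a later stage.
    """
    totals = {name: 0.0 for name, _ in stages}
    for _ in range(repeats):
        for name, fn in stages:
            start = time.perf_counter()
            fn()
            totals[name] += time.perf_counter() - start
    return {name: 1000.0 * total / repeats for name, total in totals.items()}


# Placeholder stages (assumptions for illustration only).
def fake_warp():
    time.sleep(0.002)   # stands in for: submit both warps, then wait for them


def fake_download():
    time.sleep(0.005)   # stands in for: lock both outputs and call .cpu()


if __name__ == "__main__":
    timings = time_stages([("warp", fake_warp), ("download", fake_download)])
    for name, ms in timings.items():
        print(f"{name}: {ms:.1f} ms")
```

With the code from the first post, the "warp" stage would wrap the two perspwarp calls followed by a wait on the stream (for example `vpi.Stream.default.sync()`, assuming the installed VPI version exposes it), and the "download" stage would wrap the rlock()/cpu() block. If the download time drops sharply once the warp stage waits for completion, most of the 70ms was the warp itself.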


Hi AastaLLL, I have had to move on to the Xavier for our production (same code), but yes, it was 50ms-70ms just for the CPU output. I forgot to mention that this timing is for 2 images being locked for CPU access, not 1, but 25ms+ per image is still not realtime for us. I still have the Nano set up if there is anything else we can try, but if this is a limit of the hardware we can move on.


Does the same issue occur on Xavier?


