Any advice on the lowest-latency pipeline for displaying NV12 textures on a monitor (Windows 10)?

Hello everyone,

Do you have any advice on the best pipeline for achieving the lowest latency when displaying a UVC camera NV12 stream on a monitor in Windows 10?

I know NVIDIA provides the NPP, NPP-PLUS, and CV-CUDA libraries, which offer functions for NV12 → RGB conversion.

However, that’s just part of the story. I need a pipeline optimized for low latency with minimal CPU involvement.

I was thinking of using Media Foundation to parse the UVC stream, uploading the NV12 frames into GPU VRAM via pinned (page-locked) host memory, and then running an NPP function to convert NV12 → RGB on the GPU.
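For the middle of that pipeline, a minimal sketch of the upload + conversion step might look like the following. This assumes the Media Foundation capture loop delivers frames into a `cudaHostAlloc`-pinned buffer, uses NPP's `nppiNV12ToRGB_8u_P2C3R`, and abbreviates all error handling; buffer names and the surrounding loop are illustrative, not a definitive implementation.

```cpp
#include <cuda_runtime.h>
#include <nppi_color_conversion.h>

// Convert one NV12 frame (Y plane followed by interleaved UV plane) that
// sits in pinned host memory into packed RGB on the device.
void convertFrame(const unsigned char* hostNV12, int width, int height)
{
    size_t lumaSize   = (size_t)width * height;  // Y plane bytes
    size_t chromaSize = lumaSize / 2;            // interleaved UV plane bytes

    // Device buffers: the two NV12 planes and the RGB output.
    Npp8u *dY, *dUV, *dRGB;
    cudaMalloc(&dY,   lumaSize);
    cudaMalloc(&dUV,  chromaSize);
    cudaMalloc(&dRGB, lumaSize * 3);

    // Copies from pinned memory can be asynchronous, which keeps the CPU
    // out of the transfer and lets it overlap with other work.
    cudaMemcpyAsync(dY,  hostNV12,            lumaSize,   cudaMemcpyHostToDevice);
    cudaMemcpyAsync(dUV, hostNV12 + lumaSize, chromaSize, cudaMemcpyHostToDevice);

    const Npp8u* src[2] = { dY, dUV };     // Y plane, then UV plane
    NppiSize roi = { width, height };
    // Both NV12 planes share the same row stride (width bytes here).
    nppiNV12ToRGB_8u_P2C3R(src, width, dRGB, width * 3, roi);

    cudaDeviceSynchronize();
    // dRGB now holds packed 8-bit RGB; hand it to the presentation path.
    cudaFree(dY); cudaFree(dUV); cudaFree(dRGB);
}
```

In a real pipeline the device buffers would be allocated once and reused, and the copy + conversion would run on a dedicated CUDA stream rather than synchronizing every frame.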

After that, I'm a bit lost. What's the best way to display it on the screen? Should I use a DXGI swap chain, and if so, how would that work?

Thanks!

So far I've used OpenGL for displaying 2D CUDA data, by registering a texture as writable by CUDA and using it in my scenes directly or with a small custom shader.

(The scene is just a rectangle, i.e. two triangles forming a quad. The scene is then rendered into the application window. Custom shaders can recompute the coordinates for dynamic resizing.)

That has always worked and was good enough; no extra copies needed.
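The interop path described above can be sketched roughly as follows: register the GL texture with CUDA once, then each frame map it, fetch the backing CUDA array, and copy the converted pixels into it device-to-device. This assumes an existing OpenGL context, an already-allocated `GL_RGBA8` texture, and RGBA pixel data on the device; the function names around the calls are illustrative.

```cpp
#include <cuda_gl_interop.h>

static cudaGraphicsResource* cudaTex = nullptr;

// One-time setup: register the GL texture for CUDA write access.
// The texture must already be allocated (e.g. via glTexImage2D).
void registerTexture(GLuint glTexture)
{
    cudaGraphicsGLRegisterImage(&cudaTex, glTexture, GL_TEXTURE_2D,
                                cudaGraphicsRegisterFlagsWriteDiscard);
}

// Per frame: map the resource, copy converted RGBA pixels into the
// texture's backing array, unmap, then draw the textured quad as usual.
void uploadFrame(const void* dRGBA, size_t srcPitch, int width, int height)
{
    cudaGraphicsMapResources(1, &cudaTex);

    cudaArray_t array;
    cudaGraphicsSubResourceGetMappedArray(&array, cudaTex, 0, 0);

    // Device-to-device copy: no CPU involvement, no host round trip.
    cudaMemcpy2DToArray(array, 0, 0, dRGBA, srcPitch,
                        (size_t)width * 4, height, cudaMemcpyDeviceToDevice);

    cudaGraphicsUnmapResources(1, &cudaTex);
}
```

The `cudaGraphicsRegisterFlagsWriteDiscard` flag tells the driver the previous texture contents never need to be preserved, which is exactly the streaming-video case and avoids unnecessary synchronization.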
