Outputting image

How can I output a buffer written by a kernel as a pixel buffer to a monitor connected to the same GPU? I would like to display it directly, without copying it back to the CPU.

Use the CUDA-graphics interop sample codes as a starting point. If you use OpenGL, look for the samples that use an OpenGL PBO (pixel buffer object) as the interop vehicle.

Here’s a condensed example:

Sorting Pixels from opengl using CUDA and Thrust - Stack Overflow

The use of PIXEL_UNPACK_BUFFER_ARB there indicates the use of an OpenGL PBO.
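
For reference, the PBO route looks roughly like the sketch below. It is not taken from the linked answer or from the CUDA samples; it is a minimal illustration that assumes freeglut and GLEW for window and context setup and a compatibility-profile OpenGL context. The kernel `fill`, the `CHECK` macro, the 512x512 size, and the build line are placeholders made up for this example. The kernel writes into the mapped PBO, and OpenGL then copies the PBO into a texture on the GPU, so the pixels never pass through host memory.

```cuda
// Sketch: a CUDA kernel writes RGBA pixels into an OpenGL PBO, which is then
// drawn without a round trip through host memory.
// Assumed build line (Linux): nvcc interop.cu -lGLEW -lglut -lGL
#include <GL/glew.h>
#include <GL/freeglut.h>
#include <cuda_runtime.h>
#include <cuda_gl_interop.h>
#include <cstdio>
#include <cstdlib>

static const int W = 512, H = 512;            // illustrative image size
static GLuint pbo = 0, tex = 0;
static cudaGraphicsResource *pboRes = nullptr;

// Hypothetical helper: abort on any CUDA runtime error.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "%s failed: %s\n", #call, cudaGetErrorString(e)); exit(1); } } while (0)

// Placeholder kernel: fills the buffer with a simple colour gradient.
__global__ void fill(uchar4 *pixels, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h)
        pixels[y * w + x] = make_uchar4(x * 255 / w, y * 255 / h, 128, 255);
}

static void display()
{
    // Map the PBO so CUDA can write to it, run the kernel, then unmap.
    uchar4 *dptr = nullptr;
    size_t bytes = 0;
    CHECK(cudaGraphicsMapResources(1, &pboRes, 0));
    CHECK(cudaGraphicsResourceGetMappedPointer((void **)&dptr, &bytes, pboRes));
    dim3 block(16, 16), grid((W + 15) / 16, (H + 15) / 16);
    fill<<<grid, block>>>(dptr, W, H);
    CHECK(cudaGetLastError());
    CHECK(cudaGraphicsUnmapResources(1, &pboRes, 0));

    // Copy PBO -> texture on the GPU; the null pointer means offset 0 in the bound PBO.
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, W, H, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    // Draw a full-window textured quad (fixed-function pipeline for brevity).
    glEnable(GL_TEXTURE_2D);
    glBegin(GL_QUADS);
    glTexCoord2f(0, 0); glVertex2f(-1, -1);
    glTexCoord2f(1, 0); glVertex2f( 1, -1);
    glTexCoord2f(1, 1); glVertex2f( 1,  1);
    glTexCoord2f(0, 1); glVertex2f(-1,  1);
    glEnd();
    glutSwapBuffers();
}

int main(int argc, char **argv)
{
    glutInit(&argc, argv);
    glutInitDisplayMode(GLUT_RGBA | GLUT_DOUBLE);
    glutInitWindowSize(W, H);
    glutCreateWindow("CUDA to OpenGL PBO");
    if (glewInit() != GLEW_OK) { fprintf(stderr, "glewInit failed\n"); return 1; }

    // Allocate the PBO and the texture it will be unpacked into.
    glGenBuffers(1, &pbo);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, pbo);
    glBufferData(GL_PIXEL_UNPACK_BUFFER, W * H * 4, nullptr, GL_DYNAMIC_DRAW);
    glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);

    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, W, H, 0, GL_RGBA, GL_UNSIGNED_BYTE, nullptr);

    // Register the PBO with CUDA once; CUDA overwrites its contents each frame.
    CHECK(cudaGraphicsGLRegisterBuffer(&pboRes, pbo, cudaGraphicsRegisterFlagsWriteDiscard));

    glutDisplayFunc(display);
    glutMainLoop();
    return 0;
}
```

The important part is the register once, then map / launch / unmap per frame pattern. The windowing code around it could just as well use GLFW or another toolkit.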

I am not using OpenGL. Is there a pure CUDA solution?

No. You would have to use a CUDA-graphics interop method. That would be CUDA+OpenGL, CUDA+DirectX, or very recently, CUDA+Vulkan.