I have a question.
I am implementing an image processing program in CUDA.
I want to output the result image, which lives in device memory, directly to a display device without passing through host memory.
The input array in my program is processed several times by different kernel functions, so there are several output images. After combining them into a single image in device code, I need to send that result image directly to the display device.
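To make the setup concrete, here is a minimal sketch of the kind of pipeline I mean. It is not my real code: the kernel names (`invertKernel`, `thresholdKernel`, `placeTileKernel`), the helper `runPipeline`, the two processing steps, and the side-by-side layout are all made up for illustration, and I assume 8-bit grayscale images. The copy back to the host at the end is there only so the sketch can be checked; it is exactly the step I would like to avoid. Error checking is omitted for brevity.

```cuda
#include <cassert>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical processing step 1: invert an 8-bit grayscale image.
__global__ void invertKernel(const unsigned char* in, unsigned char* out, int w, int h)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h) out[y * w + x] = 255 - in[y * w + x];
}

// Hypothetical processing step 2: binary threshold at t.
__global__ void thresholdKernel(const unsigned char* in, unsigned char* out,
                                int w, int h, unsigned char t)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h) out[y * w + x] = (in[y * w + x] >= t) ? 255 : 0;
}

// Copies one w x h tile into the combined image, starting at column xOff.
__global__ void placeTileKernel(const unsigned char* tile, unsigned char* combined,
                                int w, int h, int combinedW, int xOff)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < w && y < h) combined[y * combinedW + xOff + x] = tile[y * w + x];
}

// Runs both steps and tiles their outputs side by side in device memory.
// The result is copied to the host ONLY so this sketch can be verified.
std::vector<unsigned char> runPipeline(const std::vector<unsigned char>& input, int w, int h)
{
    const size_t n = static_cast<size_t>(w) * h;
    assert(input.size() == n);

    unsigned char *dIn, *dA, *dB, *dCombined;
    cudaMalloc(&dIn, n);
    cudaMalloc(&dA, n);
    cudaMalloc(&dB, n);
    cudaMalloc(&dCombined, 2 * n);   // two tiles side by side
    cudaMemcpy(dIn, input.data(), n, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((w + block.x - 1) / block.x, (h + block.y - 1) / block.y);

    invertKernel<<<grid, block>>>(dIn, dA, w, h);
    thresholdKernel<<<grid, block>>>(dIn, dB, w, h, 128);
    placeTileKernel<<<grid, block>>>(dA, dCombined, w, h, 2 * w, 0);
    placeTileKernel<<<grid, block>>>(dB, dCombined, w, h, 2 * w, w);
    cudaDeviceSynchronize();

    // dCombined is the image I would like to display without this copy.
    std::vector<unsigned char> result(2 * n);
    cudaMemcpy(result.data(), dCombined, 2 * n, cudaMemcpyDeviceToHost);

    cudaFree(dIn);
    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dCombined);
    return result;
}
```

In this sketch, `dCombined` plays the role of the final result image: it is produced entirely on the device, and everything up to the last `cudaMemcpy` stays in device memory.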
Is this possible?
Thanks in advance.