I am currently developing an application that relies heavily on CUDA/OpenGL interoperability and operates on two sets of buffers. To fill every vsync slot it needs to do a bit of computation on the first buffer, then visualize the other buffer, then compute, visualize, compute, visualize, and so on.
Currently I have to decompose my computation into smaller chunks to be able to do this. Decomposing the kernels and making them operate on a subset of the data seems like a waste of programming time. It would be much simpler if there were a way to tell CUDA to work on the data for a certain period, pause so the application can make OpenGL calls to display data, and then resume kernel execution, repeating until the kernel has finished.
I guess this could be implemented either by telling CUDA to work for a certain number of milliseconds, or by an API call that, during asynchronous execution, tells CUDA to pause a kernel in flight, copy the CUDA context out to global memory, and let the application perform OpenGL calls. This would make it a lot easier to write simple programs that execute efficiently on the GPU while still allowing the problem to be subdivided into smaller chunks.
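For reference, the chunked workaround I described above looks roughly like this. This is only a sketch: `processChunk`, `drawOtherBuffer`, and the doubling operation are hypothetical placeholders for my real kernels and GL code, not anything from an actual API.

```cuda
// Hypothetical placeholder kernel: processes one slice [offset, offset+n)
// of the buffer instead of the whole thing.
__global__ void processChunk(float *data, int offset, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[offset + i] *= 2.0f;   // stand-in for the real computation
}

// Interleave compute on buffer A with visualization of buffer B.
// drawOtherBuffer() is a placeholder for unmapping the GL resource
// and issuing the OpenGL draw calls.
void computeAndVisualize(float *d_bufA, int total, int chunk)
{
    for (int offset = 0; offset < total; offset += chunk) {
        int n = min(chunk, total - offset);
        processChunk<<<(n + 255) / 256, 256>>>(d_bufA, offset, n);
        cudaDeviceSynchronize();    // hand the GPU back before rendering
        drawOtherBuffer();          // visualize buffer B between chunks
    }
}
```

Every kernel has to carry the extra `offset`/`n` plumbing and the host loop has to be hand-tuned so each chunk fits inside a frame, which is exactly the programming overhead I would like to avoid.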
Is this already possible? If so, how?
If this is not possible, is this something planned?
Does anyone else think this could be useful?