I’m trying to build a program that combines CUDA and OpenGL and overlaps memory copies with kernel execution (concurrent copy and execute).
It mostly works, except for the call to cudaGraphicsUnmapResources on a mapped texture: although it accepts a stream argument, it will not let a host-to-device copy on another stream run in parallel with it.
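To make the pattern concrete, here is a rough sketch of what one frame looks like. This is not my actual code: processKernel, runFrame, texResource, devWork, devUpload, hostBuf and the two stream names are placeholders, and it assumes the texture was already registered with cudaGraphicsGLRegisterImage and that hostBuf is pinned memory from cudaMallocHost.

```cuda
#include <cuda_runtime.h>

// Return early with the error code if a runtime call fails.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) return e; } while (0)

// Placeholder kernel standing in for the real per-frame work.
__global__ void processKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

// Assumptions (hypothetical names, not from a real project):
//   texResource   - GL texture already registered with cudaGraphicsGLRegisterImage
//   hostBuf       - pinned host memory (cudaMallocHost), required for async overlap
//   interopStream - stream used for map, kernel and unmap
//   copyStream    - separate stream used for the upload
cudaError_t runFrame(cudaGraphicsResource_t texResource,
                     float *devWork, float *devUpload,
                     const float *hostBuf, int n,
                     cudaStream_t interopStream, cudaStream_t copyStream)
{
    // Map the texture for CUDA access on the interop stream.
    CHECK(cudaGraphicsMapResources(1, &texResource, interopStream));

    cudaArray *texArray = 0;
    CHECK(cudaGraphicsSubResourceGetMappedArray(&texArray, texResource, 0, 0));

    // Kernel work on the interop stream (writing into texArray is omitted here).
    processKernel<<<(n + 255) / 256, 256, 0, interopStream>>>(devWork, n);
    CHECK(cudaGetLastError());

    // Upload the next batch of data on a different stream.
    CHECK(cudaMemcpyAsync(devUpload, hostBuf, n * sizeof(float),
                          cudaMemcpyHostToDevice, copyStream));

    // Unmap on the interop stream; this is the call that does not seem
    // to overlap with the copy issued on copyStream.
    CHECK(cudaGraphicsUnmapResources(1, &texResource, interopStream));

    return cudaSuccess;
}
```

The kernel and the cudaMemcpyAsync overlap as I would expect; it is only the unmap that appears to serialize against the copy.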
Does anyone know if there is a way around this?
I’m using CUDA 4 with 280 drivers at the moment.
Thanks