I have noticed in my applications that I cannot call cudaDeviceSetLimit with cudaLimitMallocHeapSize before setting the OpenGL device for interop. If I do, I get an error saying that an application is already using the device. Also, once I've set the OpenGL device, I cannot call cudaDeviceGetLimit with cudaLimitMallocHeapSize without getting an unknown error. Without OpenGL interop, I can set and get the heap size and use the heap without problems.
I can, however, still use the heap inside kernels (compiled for sm_20, of course) with OpenGL interop and the OpenGL device set, presumably with the default heap size?
Is this the expected behavior? I could not see anything about it in the programming guide.
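For reference, here is a minimal sketch of the non-interop sequence described above, which is the case that works. It is not my actual application code: the kernel name heapKernel, the CHECK macro, the 64 MB heap size, and the use of device 0 are all placeholders chosen for illustration. In the interop case, the cudaGLSetGLDevice call would sit at the marked comment, after an OpenGL context has been made current (context creation is not shown).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with the runtime's error string on any failure.
#define CHECK(call)                                                     \
    do {                                                                \
        cudaError_t err_ = (call);                                      \
        if (err_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s failed at line %d: %s\n", #call,        \
                    __LINE__, cudaGetErrorString(err_));                \
            exit(EXIT_FAILURE);                                         \
        }                                                               \
    } while (0)

// Each thread allocates from the device heap, writes to the block,
// counts itself if the allocation succeeded, and frees the block.
__global__ void heapKernel(int *successCount)
{
    int *p = (int *)malloc(64 * sizeof(int));
    if (p != NULL) {
        p[0] = threadIdx.x;
        atomicAdd(successCount, 1);
        free(p);
    }
}

int main()
{
    // Interop case (assumption): cudaGLSetGLDevice(0) from
    // <cuda_gl_interop.h> would be called here instead of cudaSetDevice,
    // once an OpenGL context is current.
    CHECK(cudaSetDevice(0));

    // Request a larger device heap before any kernel that uses malloc runs.
    size_t requested = 64 * 1024 * 1024;  // 64 MB, arbitrary example value
    CHECK(cudaDeviceSetLimit(cudaLimitMallocHeapSize, requested));

    // Read the limit back to see what the runtime actually granted.
    size_t actual = 0;
    CHECK(cudaDeviceGetLimit(&actual, cudaLimitMallocHeapSize));
    printf("heap size: requested %lu bytes, got %lu bytes\n",
           (unsigned long)requested, (unsigned long)actual);

    int *dCount = NULL;
    CHECK(cudaMalloc((void **)&dCount, sizeof(int)));
    CHECK(cudaMemset(dCount, 0, sizeof(int)));

    heapKernel<<<1, 32>>>(dCount);
    CHECK(cudaGetLastError());        // launch errors
    CHECK(cudaDeviceSynchronize());   // execution errors

    int hCount = 0;
    CHECK(cudaMemcpy(&hCount, dCount, sizeof(int), cudaMemcpyDeviceToHost));
    printf("threads with a successful in-kernel malloc: %d\n", hCount);

    CHECK(cudaFree(dCount));
    return 0;
}
```

This would be built with something like nvcc -arch=sm_20, since in-kernel malloc needs compute capability 2.0 or higher. Printing cudaGetErrorString for every call is deliberate: it shows exactly which call fails, and with which error, once the interop call is added.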
Hardware: A single GTX 580
Software: CUDA 4.0, Latest