I’m trying to link two shared libraries, each containing a different version of my CUDA kernel, into my application. The application itself has no CUDA dependency; it’s all in the shared libraries. Each shared library queries the available devices on startup using cudaGetDeviceCount and cudaGetDeviceProperties.
This works fine if I build both libraries with the same configuration. But if I build the first library in emulation mode and the second in release mode, neither kernel is executed. The kernel call returns immediately without doing any calculations.
Is it possible to have both configurations in one application?
I’m not sure that what you describe is possible. Even if you find a way in CUDA 1.0, it will need a different solution in 1.1, since the way emulation mode is handled has changed. If I had to guess, I think the only way to do it would be to load your libraries dynamically with dlopen instead of linking the application against them, but I could be wrong.
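To make the dlopen idea concrete, here is a minimal sketch in C. It is untested against CUDA and does not show that an emulation build and a device build can coexist in one process. The library paths and the exported symbol name (run_kernel, taking no arguments and returning int) are assumptions made up for illustration. RTLD_LOCAL keeps each library's symbols out of the global namespace, so the two libraries do not see each other's definitions.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical entry point each library is assumed to export. */
typedef int (*kernel_entry_fn)(void);

struct kernel_lib {
    void *handle;          /* handle returned by dlopen, or NULL */
    kernel_entry_fn run;   /* resolved entry point, or NULL */
};

/* Load one library at run time and resolve its entry point.
 * Returns 0 on success, -1 on failure (lib is left empty). */
int kernel_lib_open(struct kernel_lib *lib, const char *path, const char *symbol)
{
    assert(lib != NULL);
    lib->handle = NULL;
    lib->run = NULL;

    /* RTLD_LOCAL: do not expose this library's symbols to later loads. */
    lib->handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (lib->handle == NULL) {
        fprintf(stderr, "dlopen(%s) failed: %s\n", path, dlerror());
        return -1;
    }

    /* POSIX-style conversion of the dlsym result to a function pointer. */
    *(void **)(&lib->run) = dlsym(lib->handle, symbol);
    if (lib->run == NULL) {
        fprintf(stderr, "dlsym(%s) failed in %s\n", symbol, path);
        dlclose(lib->handle);
        lib->handle = NULL;
        return -1;
    }
    return 0;
}

/* Unload the library if it was loaded; safe to call on an empty lib. */
void kernel_lib_close(struct kernel_lib *lib)
{
    assert(lib != NULL);
    if (lib->handle != NULL) {
        dlclose(lib->handle);
    }
    lib->handle = NULL;
    lib->run = NULL;
}
```

The application would call kernel_lib_open once per library, for example with "./libkernel_emu.so" and "./libkernel_release.so" (both names hypothetical), and then invoke each lib.run pointer. On older glibc versions this needs -ldl at link time.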
Even if you could get the app to execute, you may still have one major problem: one thread = one CUDA context. So you would need to make the calls into the different libraries from separate threads.
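The separate-threads point could look roughly like the sketch below, which uses POSIX threads. It only shows the threading pattern: each library entry point is invoked on its own thread, so the two calls never share a thread. The kernel_entry_fn signature is an assumption, and stub_entry is a stand-in for a function pointer that would really come from one of the libraries.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical entry point signature exported by each library. */
typedef int (*kernel_entry_fn)(void);

struct thread_job {
    kernel_entry_fn entry; /* function to call on the worker thread */
    int result;            /* value returned by entry */
};

/* Thread body: run the entry point and store its return value. */
static void *thread_main(void *arg)
{
    struct thread_job *job = arg;
    job->result = job->entry();
    return NULL;
}

/* Run entry on a freshly created thread and wait for it to finish.
 * Returns 0 on success, -1 if the thread could not be created or joined. */
int run_on_own_thread(kernel_entry_fn entry, int *result)
{
    struct thread_job job = { entry, 0 };
    pthread_t tid;

    if (pthread_create(&tid, NULL, thread_main, &job) != 0) {
        return -1;
    }
    if (pthread_join(tid, NULL) != 0) {
        return -1;
    }
    *result = job.result;
    return 0;
}

/* Stand-in for a library entry point, used only to exercise the helper. */
int stub_entry(void)
{
    return 7;
}
```

If a context really is tied to its thread, anything one call sets up would presumably be gone by the time the next call runs on a new thread. A long-lived worker thread per library, fed through a queue, would be the natural next step, though it needs more code than fits here.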