CUDA deployment

I have a CUDA application. I understand that I need to distribute CUDART32_55.DLL or CUDART64_55.DLL with it. I can pull these from the “bin” folder of the toolkit on my development machine without any problem.

But there are some other issues. By default the application is linked directly against NVCUDA.DLL and NVCUVID.DLL. This means anyone running it without an NVIDIA driver installed will get an error at startup and will not be able to get into the application at all. That is not the desired behavior.

I found a reference that said dynamic loading of the NVIDIA DLLs (via LoadLibrary, and I presume also via Visual Studio delay loading) was not supported because of TLS initialization issues. That was a very old posting, and I also saw a reference saying this was fixed in CUDA 1.0, so evidently it was a pre-release problem only.

So, is Visual Studio delay loading the correct way to handle this? Delay loading only covers the imports of my own executable: if I call into CUDART32_55.DLL, I have no control over how it in turn references NVCUDA.DLL. Or are other people handling this by shipping two versions of their application, one with NVIDIA/CUDA support and one without?
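To make the question concrete, here is a minimal sketch of the kind of runtime check I have in mind: try to load the driver library and look up cuInit before touching any CUDA code path. The helper names library_exports and cuda_driver_present are my own, not part of any NVIDIA API, and the non-Windows branch (libcuda.so.1 via dlopen) is only there so the sketch is portable. I am assuming that finding cuInit is a good enough signal that a driver is installed; it says nothing about whether the driver is new enough for the toolkit.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
#include <dlfcn.h>
#endif

// Hypothetical helper: returns true if `library` can be loaded and
// exports `symbol`. The library is unloaded again before returning.
bool library_exports(const char* library, const char* symbol) {
#ifdef _WIN32
    HMODULE handle = LoadLibraryA(library);
    if (handle == nullptr) {
        return false;  // DLL not found or failed to load
    }
    bool found = GetProcAddress(handle, symbol) != nullptr;
    FreeLibrary(handle);
    return found;
#else
    void* handle = dlopen(library, RTLD_LAZY);
    if (handle == nullptr) {
        return false;  // shared object not found or failed to load
    }
    bool found = dlsym(handle, symbol) != nullptr;
    dlclose(handle);
    return found;
#endif
}

// Hypothetical helper: probes for the CUDA driver library that is
// installed with the NVIDIA driver (never shipped with the application).
bool cuda_driver_present() {
#ifdef _WIN32
    return library_exports("nvcuda.dll", "cuInit");
#else
    return library_exports("libcuda.so.1", "cuInit");
#endif
}
```

The idea would be to call cuda_driver_present() once at startup and enable the GPU path only if it returns true. For that to help, the executable must not import NVCUDA.DLL or NVCUVID.DLL at load time, which as far as I can tell means linking with /DELAYLOAD:nvcuda.dll and /DELAYLOAD:nvcuvid.dll plus delayimp.lib, or resolving every driver entry point through GetProcAddress.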

I have found enough information to say that distributing some version of NVCUDA.DLL is a hugely bad idea, and I can agree with that. It seems that the “proper” version of this DLL is installed with the driver and should not be bundled with the application.