So, I have written a program that uses CUDA and everything is great. It loads a predetermined file, performs calculations on the data from the file, and then saves the results to another file. It runs much faster than the equivalent CPU program and I am very happy. I use Visual Studio 2008 and the CUDA 4.0 toolkit. The code is compatible with compute capability 1.0, but my card has compute capability 2.0 (not that I think this matters).
Now here is the thing: how do I make this same executable run on other Windows systems that have CUDA-capable cards? I have been unable to find any information on this, and my own attempts have failed (I have a few Windows 7 x64 machines with different hardware at my disposal). Is there a way to compile the code once so that it includes everything it needs to run on any other system? I cannot install Visual Studio and the CUDA toolkit on every machine that my program will end up on; it just isn’t practical. Ideally it would work on all compute capabilities, on both 32-bit and 64-bit systems, and on XP/Vista/7. I don’t actually mind having a separate executable for each of those combinations, so long as I can be sure that there is an executable that will work. I found some references to making a device code repository, but I am not sure that is what I need and I couldn’t figure it out.
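To make "failed" more concrete on each target machine, a minimal sketch of a diagnostic is given here. It is not part of the program described above. The file name `devcheck.cu` is hypothetical, and the sketch assumes only standard CUDA runtime API calls (`cudaDriverGetVersion`, `cudaRuntimeGetVersion`, `cudaGetDeviceCount`, `cudaGetDeviceProperties`). It reports what the installed driver and runtime can see and nothing more:

```cuda
// devcheck.cu (hypothetical name): reports driver/runtime versions and the
// compute capability of each visible CUDA device. Illustrative sketch only.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int driverVersion = 0;
    int runtimeVersion = 0;

    // driverVersion stays 0 when no CUDA driver is installed.
    cudaDriverGetVersion(&driverVersion);
    cudaRuntimeGetVersion(&runtimeVersion);
    printf("Driver version: %d, runtime version: %d\n",
           driverVersion, runtimeVersion);

    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        // Typical causes: no CUDA-capable device, or a driver that is
        // missing or older than the runtime the program was built against.
        printf("cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    for (int i = 0; i < deviceCount; ++i) {
        cudaDeviceProp prop;
        err = cudaGetDeviceProperties(&prop, i);
        if (err != cudaSuccess) {
            printf("Device %d: query failed: %s\n", i, cudaGetErrorString(err));
            continue;
        }
        printf("Device %d: %s, compute capability %d.%d\n",
               i, prop.name, prop.major, prop.minor);
    }

    // Non-zero exit code when no usable device was found.
    return deviceCount > 0 ? 0 : 1;
}
```

Built with `nvcc devcheck.cu -o devcheck` and copied to a target machine, the output (or the error it stops on) would show whether the problem lies with the driver, the runtime, or the compute capability of the card.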
Your help is much appreciated. Thanks!