Toolkit on Customer Computer

Hello,
I compiled the default CUDA runtime project from Visual Studio and moved the executable to a computer that has a GPU but does not have the CUDA Toolkit installed. This seems like exactly the situation of a customer distribution. They get the following when running it. I do not get this on my computer, which has the toolkit installed. Is that the issue? Does the client computer need the developer toolkit, or is there a way to build it into the project?

Code:
#include "cuda_runtime.h"

#include <cstdio>
#include <iostream>

// Defined later in the default Visual Studio CUDA template.
cudaError_t addWithCuda(int *c, const int *a, const int *b, unsigned int size);

int main()
{
    const int arraySize = 5;
    const int a[arraySize] = { 1, 2, 3, 4, 5 };
    const int b[arraySize] = { 10, 20, 30, 40, 50 };
    int c[arraySize] = { 0 };

    // Add vectors in parallel.
    cudaError_t cudaStatus = addWithCuda(c, a, b, arraySize);
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "addWithCuda failed!");
        std::cout << "Press any key to exit." << std::endl;
        std::cin.get();
        return 1;
    }

    printf("{1,2,3,4,5} + {10,20,30,40,50} = {%d,%d,%d,%d,%d}\n",
        c[0], c[1], c[2], c[3], c[4]);

    // cudaDeviceReset must be called before exiting in order for profiling and
    // tracing tools such as Nsight and Visual Profiler to show complete traces.
    cudaStatus = cudaDeviceReset();
    if (cudaStatus != cudaSuccess) {
        fprintf(stderr, "cudaDeviceReset failed!");
        std::cout << "Press any key to exit." << std::endl;
        std::cin.get();
        return 1;
    }

    return 0;
}

Thanks


The Tesla C2050 (Fermi architecture; compute capability 2.0) is not supported by the current version of CUDA (11.x) and has not been supported by any CUDA version shipped in the last few years. It is also not supported by any recent NVIDIA drivers. I forget how far back you would have to go for the last software stack that offered Fermi support; I think (vague memory!) CUDA 8 and driver 36x.yy from 2017.

IMHO, there is no point in trying to utilize such outdated GPU hardware unless you are into retro computing. At present, CUDA requires at least compute capability 3.5 (certain Kepler-family GPUs); however, compute capability 3.5 is deprecated, which means support for it will likely be removed in the next major CUDA version. If you are looking for affordable GPU hardware that is reasonably future-proof, I would suggest Pascal-architecture GPUs (compute capability 6.x) at a minimum.

I'm fine with that; there has to be a logical cutoff point somewhere. So, to get to the core of what the question really is: when I do a build and distribute it to a customer, does the customer also need the toolkit installed, or does it get packed up with the VS build? I'm running VS2019.

Customers do not need to have the CUDA Toolkit installed, but you may have to redistribute certain libraries that your application links dynamically. Sorry, I don't know offhand which ones those are, as I have been using CUDA exclusively on my developer machine in recent years. I notice that NVIDIA has put a list of redistributable components into the EULA.
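Independent of which libraries you ship, the one thing the customer machine must always have is a sufficiently recent NVIDIA driver. A defensive startup check along these lines (a minimal sketch of my own, not part of the template; the function name is made up) lets the application report a clear message instead of failing opaquely on the customer's machine:

Code:
#include "cuda_runtime.h"
#include <cstdio>

// Sketch: returns true if a CUDA-capable GPU and a sufficiently recent driver
// are present. cudaGetDeviceCount reports cudaErrorInsufficientDriver when the
// installed driver is too old, and an error/zero count when no GPU is found.
bool cudaIsUsable()
{
    int deviceCount = 0;
    cudaError_t status = cudaGetDeviceCount(&deviceCount);
    if (status == cudaErrorInsufficientDriver) {
        fprintf(stderr, "The NVIDIA driver is missing or too old for this application.\n");
        return false;
    }
    if (status != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(status));
        return false;
    }
    return deviceCount > 0;
}

int main()
{
    if (!cudaIsUsable()) {
        return 1;
    }
    printf("CUDA is usable on this machine.\n");
    return 0;
}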

Awesome. Thanks for your help!

Is there a way to check for compute capability 6.x or higher?

Not sure what you mean. Each GPU has a particular compute capability. You can find handy lists at NVIDIA (CUDA GPUs - Compute Capability | NVIDIA Developer) or Wikipedia (CUDA - Wikipedia) or use the TechPowerUp GPU database (GPU Database | TechPowerUp).

I am not making any representations as to how accurate any of those lists are. In some cases one and the same card model has shipped with GPUs belonging to different architectures (bad move by NVIDIA).

When you are running on a system, you can use the deviceQuery app that comes with CUDA to find out what GPU is in the system.
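If you would rather do the check inside your own program instead of parsing deviceQuery's output, the runtime API exposes the compute capability directly via cudaGetDeviceProperties. A minimal sketch (the 6.0 threshold just mirrors the Pascal recommendation above):

Code:
#include "cuda_runtime.h"
#include <cstdio>

int main()
{
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        fprintf(stderr, "No CUDA-capable GPU found.\n");
        return 1;
    }
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) {
            continue;
        }
        // prop.major and prop.minor form the compute capability, e.g. 6.1.
        printf("Device %d: %s (compute capability %d.%d)%s\n",
               dev, prop.name, prop.major, prop.minor,
               prop.major >= 6 ? " -- meets the 6.x-or-higher bar" : "");
    }
    return 0;
}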

Yep, just found the deviceQuery app, and I'm calling it from cmd and reading the results in my program to determine whether there is a compatible card. Is it worth investigating older libraries to capture more GPU cards, or is that just opening a can of worms?

By the time NVIDIA drops support for a particular architecture from CUDA, the old hardware is usually hopelessly outdated. GPU architecture still evolves relatively quickly, so at present that typically means about six years after introduction of that architecture. I assume NVIDIA plays that by ear: The fact that compute capability 3.5 is still supported (> 7 years) is probably (speculation!) due to some older super computers using GPUs with that compute capability.

Generally it is best to use the latest CUDA version. But if there is no specific pressure to update (need for new features or architecture support, need for particular bug fixes or performance enhancements) it is usually fine to lag a couple of generations. For example, I am currently using CUDA version 9.2 from 2018 but plan to move to CUDA 11 before year’s end.

Note that NVIDIA’s support typically covers the latest CUDA version, so if you report an issue with an older version the initial response will usually be a suggestion to update to the latest version.

Haha, I figured NVIDIA's response would be to upgrade. Just so I know, would version 9.2 support that Tesla card?

As I stated, I believe that CUDA 8 was the last version that included Fermi support. So, no, CUDA 9.2 does not support the Tesla C2050. Together with an old CUDA version you also need a vintage driver (a driver is included with the CUDA package) because Fermi support was removed from NVIDIA’s drivers at around the same time as it was removed from CUDA.
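If you want the application itself to verify at run time that the installed driver is new enough for the runtime it was built against, the runtime API reports both versions. A minimal sketch (versions are encoded as 1000*major + 10*minor, e.g. 9020 for CUDA 9.2; note that a newer driver also runs applications built against older runtimes, so only the "driver older than runtime" case is a problem):

Code:
#include "cuda_runtime.h"
#include <cstdio>

int main()
{
    int driverVersion = 0;   // Highest CUDA version the installed driver supports
                             // (0 if no CUDA driver is installed at all).
    int runtimeVersion = 0;  // CUDA runtime version this application was built with.
    cudaDriverGetVersion(&driverVersion);
    cudaRuntimeGetVersion(&runtimeVersion);
    printf("Driver supports up to CUDA %d.%d; application runtime is CUDA %d.%d\n",
           driverVersion / 1000, (driverVersion % 100) / 10,
           runtimeVersion / 1000, (runtimeVersion % 100) / 10);
    if (driverVersion < runtimeVersion) {
        fprintf(stderr, "The installed driver is too old for this application.\n");
        return 1;
    }
    return 0;
}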

Supporting old architectures and old software versions comes at a significant cost to the vendor. For example, the test matrix and the required hardware pool for nightly regression testing grow with each additional version, and so does the associated head count for tending to the results (a lot can go wrong when you run thousands of machines). How compelling the cost aspect is can be seen with Microsoft, for example, which basically forces customers to upgrade Windows 10 in lockstep, so 90+% (made-up number!) of Windows 10 installations are at the latest version at any given time.

There is definitely a benefit to customers, too, in using newer versions of hardware and software. Older GPUs had a lot of hardware restrictions that made CUDA programming more painful than it should be, including limited hardware hooks for profiling and debugging. I would claim that the first completely “sane” architecture was Maxwell (compute capability 5.x). On the software side, a lot more C++ support (now up to C++17 features, I think) and more libraries have been added in recent years.
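For instance (a minimal sketch of my own, assuming a CUDA 11 toolchain and compiling with nvcc -std=c++17), device code can now use C++17 features such as if constexpr:

Code:
// Build with: nvcc -std=c++17 scale.cu  (assumes a CUDA 11 toolchain)
#include "cuda_runtime.h"
#include <cstdio>

// C++17 "if constexpr" in device code: the branch is resolved at compile
// time, so each template instantiation contains only the path it needs.
template <bool UseDouble>
__global__ void scaleKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        if constexpr (UseDouble) {
            data[i] = static_cast<float>(static_cast<double>(data[i]) * 2.0);
        } else {
            data[i] = data[i] * 2.0f;
        }
    }
}

int main()
{
    const int n = 4;
    float host[n] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float *dev = nullptr;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    scaleKernel<false><<<1, n>>>(dev, n);
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dev);
    printf("%g %g %g %g\n", host[0], host[1], host[2], host[3]);
    return 0;
}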