CUDA best practices: development to production pipeline

Background

I have been doing some CUDA development for server-based software (not a desktop app), and I have found that development under Windows is generally easier than under Ubuntu.

Here are the advantages of developing CUDA under Windows:

  1. Driver installation is easy: just download > install > reboot.
  2. The Nsight plugin for Visual Studio seems to be more up to date (latest release: version 2021.2.1, June 2021) than the Nsight plugin for Eclipse (latest release: version 9.0, April 2019). The Visual Studio plugin provides quite a comfortable development environment; among other things, it lets you set debugger breakpoints inside kernels, which can be very handy (see the short sketch after this list).
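
For illustration, this is roughly the kind of device code such a breakpoint would be set in. The kernel below is a minimal sketch of my own (names like scaleKernel and alpha are made up), not code from any particular project:

```cpp
// Minimal illustrative kernel: scales an input array by a constant.
// A device-side breakpoint would typically be set on the line inside the
// if-block to inspect per-thread values of i, in[i], and out[i].
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scaleKernel(const float* in, float* out, float alpha, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = alpha * in[i];   // typical spot for a kernel breakpoint
    }
}

int main()
{
    const int n = 1 << 20;
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemset(d_in, 0, n * sizeof(float));  // initialize input so reads are defined

    scaleKernel<<<(n + 255) / 256, 256>>>(d_in, d_out, 2.0f, n);
    cudaDeviceSynchronize();

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```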

Motivation

However, when it comes to deploying the developed CUDA code to production servers (for server-based apps), things get more confusing.

First, using a flavor of Linux on the server is the obvious choice, as it is an industry standard for many server-based stacks (LAMP, Ruby on Rails, etc.). I don’t have anything against using Windows on a server, but that seems to be a niche choice made by organizations with specific reasons for it.

Second, there are signs that Linux is what is used to run CUDA code in production. For example, there are two AWS Amazon Machine Images (AMIs) for CUDA (with pre-installed drivers), and both of them use Linux to run CUDA code on GPU-optimized AWS instances. There are no such AMIs that use Windows.
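
Whichever image or OS the production host uses, it can help to verify at deployment time that the installed driver and the application's CUDA runtime are compatible. The following is a minimal sketch of my own (not part of the original question) using standard CUDA runtime calls:

```cpp
// Sanity check a production host (e.g. a GPU AMI) before running the real app:
// report the driver's CUDA version, the runtime version the binary was built
// against, and the number of visible CUDA devices.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int driverVersion = 0, runtimeVersion = 0, deviceCount = 0;

    cudaDriverGetVersion(&driverVersion);    // CUDA version supported by the installed driver
    cudaRuntimeGetVersion(&runtimeVersion);  // CUDA runtime version the app was built against
    cudaError_t err = cudaGetDeviceCount(&deviceCount);

    printf("driver CUDA version:  %d.%d\n", driverVersion / 1000, (driverVersion % 1000) / 10);
    printf("runtime CUDA version: %d.%d\n", runtimeVersion / 1000, (runtimeVersion % 1000) / 10);

    if (err != cudaSuccess || deviceCount == 0) {
        printf("no usable CUDA device: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("CUDA devices found: %d\n", deviceCount);
    return 0;
}
```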

Question

What is the industry best practice to deploy CUDA-powered server-based software to production? Is it usually developed under Windows and then compiled for a Linux-based production server?

Responses are likely to be informed by personal experience, which differs widely. You would want to treat the following as food for thought rather than firm recommendations. I am an old-school developer who used to develop with CUDA in roughly equal parts on both Windows and Linux (RHEL in particular); now retired and limited to Windows for the time being. I am familiar with cross-development from working in the embedded space for several years.

[1] Without statistics to back me up, I’d say there is more CUDA-accelerated production code deployed on Linux platforms than on Windows platforms.

[2] Generally speaking, you would want to avoid cross-development unless strictly necessary. Develop on a platform close or identical to the one you deploy on. If you need to support both Windows and Linux, you will obviously have to be flexible about the development environment and/or abstract OS differences away as much as possible (one way of doing so is sketched below).
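
To make [2] concrete, here is one minimal sketch (my own, not something from the answer above) of hiding an OS difference behind a small helper so the same CUDA host code compiles under both MSVC on Windows and gcc on Linux; the function name wallClockSeconds is an illustrative assumption:

```cpp
// Abstracting an OS difference (high-resolution wall-clock time) behind a
// single helper, so the rest of the CUDA host code is platform-neutral.
#include <cstdio>
#include <cuda_runtime.h>

#if defined(_WIN32)
#include <windows.h>
static double wallClockSeconds()
{
    LARGE_INTEGER freq, now;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&now);
    return static_cast<double>(now.QuadPart) / static_cast<double>(freq.QuadPart);
}
#else
#include <time.h>
static double wallClockSeconds()
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}
#endif

int main()
{
    double t0 = wallClockSeconds();
    cudaFree(0);   // force CUDA context creation, just to have something to time
    double t1 = wallClockSeconds();
    printf("CUDA context creation took %.3f s\n", t1 - t0);
    return 0;
}
```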

[3] For cloud deployment of CUDA-accelerated applications on Windows you would probably want to take a look at Azure. I have no personal experience with that but it seems like the obvious alternative to AWS if the target platform is Windows.

[4] Personally, I have found a Linux development environment more productive than a Windows one, but I did not and do not use any IDEs. I found development effort on Windows to be roughly 20% higher than on Linux.

[5] Personally, I have found Ubuntu problematic (various idiosyncrasies are irritating to me) and would suggest looking at RHEL as an alternative for server deployment.

I’m not sure why you think this. The plugins ship with each version of the CUDA Toolkit; here is the installed software list for Eclipse on a machine with CUDA 11.2 installed: