CUDA 10 & VS2017 & C++17

If I move my VS 2017 C++ project over to C/C++ > Language > ISO C++17, the .cu files no longer compile; nvcc fails with the message "nvcc fatal : Compiler 'cl.exe' in PATH different than the one specified with -ccbin".

The -ccbin portion of the command line is -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\VC\Tools\MSVC\14.15.26726\bin\HostX86\x64", but I'm pretty sure the cl.exe found on the PATH is a different one when compiling for C++17.

Help.

CUDA 10.0 supports C++14 but not C++17.

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#c-cplusplus-language-support

If you need C++17 host code, put that in a .cpp file. And you may need to put it in a separate project.
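
For example, a minimal sketch of that split (the file names and the pick_larger helper are placeholders I'm inventing here): the C++17 parts live in a .cpp translation unit that cl.exe compiles with /std:c++17, while the .cu file only includes a plain C++14-compatible declaration.

// host_utils.h -- shared header with C++14-compatible declarations only
#pragma once
int pick_larger(int a, int b);

// host_utils.cpp -- compiled by cl.exe with /std:c++17
#include "host_utils.h"
#include <algorithm>
#include <type_traits>

int pick_larger(int a, int b)
{
    // if constexpr and std::is_integral_v are C++17; nvcc never sees this file
    if constexpr (std::is_integral_v<int>)
        return std::max(a, b);
    else
        return a > b ? a : b;
}

// kernel.cu -- compiled by nvcc with -std=c++14
#include "host_utils.h"

__global__ void fill(int* out, int value)
{
    out[threadIdx.x] = value;
}

void launch(int* out, int a, int b)
{
    fill<<<1, 32>>>(out, pick_larger(a, b));  // host-side call into the C++17 unit
}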

If you need C++17 device code, you can compile CUDA with clang. However, I don't believe CUDA 10 is supported even by the most recent clang releases. I wanted to try some C++17 in device code, and I got CUDA 9.0 working with clang 6 (9.2 didn't work; I didn't try 9.1, but from internet searches I don't believe it works either). I haven't tried clang 7 (which supports compiling relocatable device code, though I don't know which CUDA versions it supports), and I don't think clang 8 has been released yet.

There are some small differences between clang CUDA and nvcc CUDA (some for the worse, some for the better). I had to make a couple of minor changes to my project, but otherwise it should "just work" for the most part. The clang-compiled code seems oddly slower, for reasons unknown to me, but since this is just for experimentation for now, that doesn't bother me.

https://llvm.org/docs/CompileCudaWithLLVM.html
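
For orientation, the kind of invocation that document describes looks roughly like this (a sketch only: the CUDA path, the sm_60 architecture, and the axpy.cu example are placeholders to adjust for your own setup, and the exact flags vary between clang and CUDA versions).

// Build sketch, per the LLVM doc above (Linux paths shown; adjust as needed):
//   clang++ -std=c++17 axpy.cu -o axpy --cuda-gpu-arch=sm_60 \
//       --cuda-path=/usr/local/cuda-9.0 \
//       -L/usr/local/cuda-9.0/lib64 -lcudart -ldl -lrt -pthread

#include <cstdio>
#include <cuda_runtime.h>

__global__ void axpy(float a, const float* x, float* y)
{
    int i = threadIdx.x;
    y[i] = a * x[i] + y[i];
}

int main()
{
    float x[4] = {1, 2, 3, 4}, y[4] = {10, 10, 10, 10};
    float *dx, *dy;
    cudaMalloc(&dx, sizeof(x));
    cudaMalloc(&dy, sizeof(y));
    cudaMemcpy(dx, x, sizeof(x), cudaMemcpyHostToDevice);
    cudaMemcpy(dy, y, sizeof(y), cudaMemcpyHostToDevice);
    axpy<<<1, 4>>>(2.0f, dx, dy);
    cudaMemcpy(y, dy, sizeof(y), cudaMemcpyDeviceToHost);
    printf("%g %g %g %g\n", y[0], y[1], y[2], y[3]);   // expect 12 14 16 18
    cudaFree(dx);
    cudaFree(dy);
    return 0;
}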

One example of a change I had to make involves calling functions like max/min from host code. With nvcc you can use the global namespace, i.e. call max(a,b) or ::max(a,b). With clang you have to call std::max(a,b) in host code, which means that for host/device code you either overload the function separately as __host__ and __device__ (which you can do in clang, but not nvcc), e.g.:

__host__ void function1() { ... std::max(a,b) ... }

__device__ void function1() { ... ::max(a,b) ... }

or, inside a single function, you detect whether you are compiling for device vs. host (and clang vs. nvcc, if you wish) and write different code, or wrap that detection in a macro, which is what these developers did:

http://eigen.tuxfamily.org/bz_attachmentbase/attachment.cgi?id=671

A simple version below:

// detect when clang CUDA is compiling host code
#if defined(__clang__) && defined(__CUDA__) && !defined(__CUDA_ARCH__)
#define USE_STD_NAMESPACE 1
#endif

#if defined(USE_STD_NAMESPACE)
#define __STD__ std
#else
#define __STD__
#endif

__host__ __device__ void function1() { ... __STD__::max(a,b) ... }

I wrote that quickly, but I think it should work while keeping the code kosher for nvcc as well.
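
To make that concrete, here is a self-contained version of the same trick (a sketch only: I've renamed the macro to MAXMIN_NS to avoid the reserved double-underscore name, clamp_to is an invented example, and I've only reasoned it through against the nvcc/clang behaviour described above).

#include <algorithm>        // std::max/std::min for the clang host path
#include <cstdio>
#include <cuda_runtime.h>

// When clang compiles the host side of a CUDA file, plain ::max/::min are not
// available, so route through std:: there; everywhere else (nvcc host, nvcc
// device, clang device) use the global-namespace versions.
#if defined(__clang__) && defined(__CUDA__) && !defined(__CUDA_ARCH__)
#define MAXMIN_NS std
#else
#define MAXMIN_NS
#endif

__host__ __device__ int clamp_to(int value, int lo, int hi)
{
    return MAXMIN_NS::min(MAXMIN_NS::max(value, lo), hi);
}

__global__ void clamp_kernel(int* data, int lo, int hi)
{
    int i = threadIdx.x;
    data[i] = clamp_to(data[i], lo, hi);
}

int main()
{
    int h[4] = {-5, 3, 42, 7};
    int* d;
    cudaMalloc(&d, sizeof(h));
    cudaMemcpy(d, h, sizeof(h), cudaMemcpyHostToDevice);
    clamp_kernel<<<1, 4>>>(d, 0, 10);
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    printf("%d %d %d %d\n", h[0], h[1], h[2], h[3]);   // expect 0 3 10 7
    cudaFree(d);
    return 0;
}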

EDIT: I realize you are using Visual Studio. You can get clang working with Visual Studio, though it is a bit tricky. So I guess this route is only if you really want C++17 device code. :)

Hi, when you use clang for a CUDA project, do you manage to use cuda-gdb to debug code inside a CUDA kernel? Thanks!

Great question! I don’t know. :) I’ve just been playing around with clang-CUDA for fun.

It’s also entirely possible that the standard debuggers (GDB/LLDB) that work with clang would simply work. But I haven’t tried.

Hi, a year later, how is C++17 compatibility with CUDA? Is there a roadmap for supporting the latest C++ standard? constexpr if, the new type traits, and new function templates like std::invoke would be useful.
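
Just to illustrate the kind of thing I mean, here is a sketch of device code that C++17 would enable (it needs a toolchain that accepts C++17 in device code, such as the clang route discussed above; sum_bytes is just an invented example).

#include <cstdio>
#include <cuda_runtime.h>
#include <type_traits>

// With "if constexpr", the non-matching branch is discarded at compile time,
// so each instantiation only compiles the code that is valid for T.
template <typename T>
__device__ int sum_bytes(const T& value)
{
    if constexpr (std::is_integral_v<T>) {
        int sum = 0;
        for (unsigned i = 0; i < sizeof(T); ++i)
            sum += (value >> (8 * i)) & 0xFF;
        return sum;
    } else {
        return static_cast<int>(sizeof(T));   // fallback for non-integral types
    }
}

__global__ void demo(int* out, float f, unsigned u)
{
    out[0] = sum_bytes(u);   // integral branch
    out[1] = sum_bytes(f);   // non-integral branch; the shift on float is never compiled
}

int main()
{
    int* d;
    cudaMalloc(&d, 2 * sizeof(int));
    demo<<<1, 1>>>(d, 3.5f, 0x01020304u);
    int h[2];
    cudaMemcpy(h, d, sizeof(h), cudaMemcpyDeviceToHost);
    printf("%d %d\n", h[0], h[1]);   // expect 10 (1+2+3+4) and 4 (sizeof(float))
    cudaFree(d);
    return 0;
}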


Hi, I'm bringing this thread up again. Is there any news about C++17 support in CUDA? Any roadmap? It would be interesting to use if constexpr and the other language improvements.


I've found a few machines where nvcc fails to compile the CUDA examples with
nvcc fatal : Compiler 'cl.exe' in PATH different than the one specified with -ccbin
or simply with errorlevel 1.

The reason was an Anaconda setting in
HKEY_CURRENT_USER\Software\Microsoft\Command Processor\AutoRun

see

for explanation