Visual Studio 2013 + x64 + CUDA 7.0 + std::thread == Fail ??

Hi all,

I have the strangest problem. I would like to use std::thread in my CUDA 7.0.027-enhanced project but Visual Studio won’t let me. When I generate a new CUDA-project and replace the code of kernel.cu with the following minimal example:

#include <thread>
class MYCLASS
{
public:
	void testfunc(int a){}
};
int main()
{
	MYCLASS m;
	std::thread test(&MYCLASS::testfunc, m, 1);
	test.join(); // without a join, the std::thread destructor calls std::terminate
}

it compiles very nicely using Win32 as target platform (or x64 and Win32 when using a standard C++ project without CUDA). But as soon as I switch to x64 I get a strange error:

D:\Eigene Dateien\Visual Studio 2013\Projects\test4\test4>"D:\Programme\Nvidia\CUDAToolkit\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2013 -ccbin "D:\Programme\VisualStudioProf2013\VC\bin\x86_amd64"  -ID:\Programme\Nvidia\CUDAToolkit\include -ID:\Programme\Nvidia\CUDAToolkit\include     --keep-dir x64\Release -maxrregcount=0  --machine 64 --compile -cudart static     -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi  /MD  " -o x64\Release\kernel.cu.obj "D:\Eigene Dateien\Visual Studio 2013\Projects\test4\test4\kernel.cu" 
D:\Programme\VisualStudioProf2013\VC\include\functional(1148): error : no instance of overloaded function "std::_Pmd_wrap<_Pmd_t, _Rx, _Farg0>::operator() [with _Pmd_t=void (MYCLASS::*)(int), _Rx=void (int), _Farg0=MYCLASS]" matches the argument list
C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 7.0.targets(593,9): error MSB3721: The command ""D:\Programme\Nvidia\CUDAToolkit\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2013 -ccbin "D:\Programme\VisualStudioProf2013\VC\bin\x86_amd64"  -ID:\Programme\Nvidia\CUDAToolkit\include -ID:\Programme\Nvidia\CUDAToolkit\include     --keep-dir x64\Release -maxrregcount=0  --machine 64 --compile -cudart static     -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi  /MD  " -o x64\Release\kernel.cu.obj "D:\Eigene Dateien\Visual Studio 2013\Projects\test4\test4\kernel.cu"" exited with code 2.

Does anybody have an idea what could be going wrong?

Try telling nvcc to compile for C++11:

-std=c++11

I forget exactly how to do that in Visual Studio. I believe there is a customization parameter in the CUDA C/C++ project settings, i.e. a checkbox that adds this switch to the nvcc command line for you.

Unfortunately, that did not work:

cl : Command line warning D9002: ignoring unknown option '-std=c++11'

According to someone on Stack Overflow, this option is only valid for GCC/Clang, while Visual Studio uses C++11 by default.

Hmm, sorry, I sent that option to cl.exe, the C++ compiler, which was clearly wrong.
Setting it for nvcc does not improve the situation:

CUDACOMPILE : nvcc warning : The -c++11 flag is not supported with the configured host compiler. Flag will be ignored.

Yes, I agree. My suggestion was not correct. (I was thinking of Nsight Eclipse Edition, which does have such a checkbox.)

I’m able to reproduce your issue but at the moment I don’t have an explanation.

Thank you for going to the length of reproducing the error! I had hoped this would just be some simple mistake on my part, especially since this does not seem to be such an exotic combination (nvcc, Visual Studio, x64 + std::thread).

Hi, I’m stuck on this exact same issue. Do you know of any workarounds or fixes?

The issue is not fixed in CUDA 7.0 or CUDA 7.5 as far as I can tell. I believe it will be fixed in the next major CUDA release.

Until then, the usual suggestion when incompatibilities like this arise is to separate the code that uses the incompatible element (std::thread when the thread function is in a class, in this case) into a separate .cpp file in your project, and use wrapper functions or some other method to connect that with the CUDA code that is in .cu files in your project.
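To make the separation described above concrete, here is a minimal sketch of one way to lay it out. The file names (worker.h, worker.cpp), the WorkerHandle struct, and the start_worker/finish_worker functions are all hypothetical names chosen for illustration, and the body of testfunc is a stand-in so that the result can be observed. Everything is shown in one listing for brevity; the comments mark which part would live in which file.

```cpp
// ---- worker.h (hypothetical): plain interface, safe to include from kernel.cu ----
// It deliberately does not include <thread>, so nvcc never has to
// instantiate std::thread with a pointer to a member function.
struct WorkerHandle;                  // opaque to the .cu side
WorkerHandle* start_worker(int a);    // launches MYCLASS::testfunc on a thread
int finish_worker(WorkerHandle* h);   // joins, returns the result, frees the handle

// ---- worker.cpp (hypothetical): compiled by the host compiler only ----
#include <thread>

class MYCLASS
{
public:
    void testfunc(int a) { result = a * 2; }  // stand-in body for illustration
    int result = 0;
};

struct WorkerHandle
{
    MYCLASS obj;      // the object the thread works on
    std::thread th;   // the thread running obj.testfunc
};

WorkerHandle* start_worker(int a)
{
    WorkerHandle* h = new WorkerHandle;
    // The pointer-to-member form is used here on purpose: this file is
    // assumed to be compiled without nvcc, so the failing path is avoided.
    h->th = std::thread(&MYCLASS::testfunc, &h->obj, a);
    return h;
}

int finish_worker(WorkerHandle* h)
{
    h->th.join();            // wait for testfunc to finish
    int r = h->obj.result;   // safe to read after join
    delete h;
    return r;
}
```

Under this layout, the code in kernel.cu would only include worker.h and call start_worker and finish_worker, so the std::thread constructor is never compiled through nvcc. The opaque handle is just one option; any wrapper that keeps <thread> out of the .cu translation unit should serve the same purpose.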

Just found a simple workaround myself. Instead of

std::thread test(&MYCLASS::testfunc, m, 1);

simply use

std::thread test([](MYCLASS *m, int a) { m->testfunc(a); }, &m, 1);

or:

std::thread test([&m] { m.testfunc(1); });

There is also no error with the host compiler (VC++) if the file is renamed to xxx.cpp.
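For reference, a complete version of the lambda workaround might look like the sketch below. The helper run_with_lambda and the member last are hypothetical additions, included only so that the effect of the call can be checked; the original testfunc has an empty body. Whether this sidesteps the nvcc error on a given toolchain is based on the reports in this thread, not on anything guaranteed.

```cpp
#include <thread>

class MYCLASS
{
public:
    void testfunc(int a) { last = a; }  // stand-in body so the call is observable
    int last = 0;
};

// Hypothetical helper: runs testfunc on a separate thread through a lambda
// and returns the value that testfunc stored.
int run_with_lambda(int a)
{
    MYCLASS m;
    // A lambda replaces the pointer-to-member argument. The object is passed
    // by address, so the thread works on m itself rather than on a copy.
    std::thread test([](MYCLASS* p, int v) { p->testfunc(v); }, &m, a);
    test.join();  // join before m goes out of scope
    return m.last;
}
```

Note that the original call, std::thread test(&MYCLASS::testfunc, m, 1), copies m into the thread, whereas this version operates on the caller's object. That makes the join before the end of the scope mandatory.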

I’ve reported this issue to NVIDIA as a bug.