Build Error MSB3721 When calling object method within kernel, using compiler directives

RaenirSalazar · November 16, 2015, 12:59pm

So I have a problem with building a CUDA project.

I’m trying to parallelize a physics engine without having to massively rewrite my code, so I want to add directives that mark my functions with a macro, #define CUDA_CALLABLE_MEMBER __host__ __device__, in order to reduce the amount of code duplication.

Basically I want my .cu files to call methods from my .h headers.

For example:
test.h

#include <iostream>

#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif

class helloWorld
{
public:
    CUDA_CALLABLE_MEMBER helloWorld() {};
    CUDA_CALLABLE_MEMBER void boo();

    //__host__ __device__ helloWorld() {};
    //__host__ __device__ void boo();
};

test.cpp

#include "test.h"

CUDA_CALLABLE_MEMBER void helloWorld::boo()
{

}

test.cuh

#pragma once
#include <cuda.h>
#include "cuda_runtime.h"
#include "device_launch_parameters.h"

class test
{
private:
    int SIZE;
public:
    test();
};

test.cu

#include "test.h"
#include "test.cuh"

__global__ void myAddKernel(helloWorld* hw, int *c, const int *a, const int *b, int n)
{
    int i = blockIdx.x*blockDim.x + threadIdx.x;

    if (i < n)
    {
        //hw->boo();
        c[i] = a[i] + b[i];

    }
}

test::test()
{
    SIZE = 1024;

    helloWorld* hello = new helloWorld();

}

The line hw->boo(); produces the following error:

Error 42 error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.exe" -gencode=arch=compute_20,code="sm_20,compute_20" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -I\Dependencies\ -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -G --keep-dir x64\Debug -maxrregcount=0 --machine 64 --compile -cudart static -g -DWIN32 -DWIN64 -D_DEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Zi /RTC1 /MDd " -o x64\Debug\test.cu.obj "H:\Projects\Concordia\COMP 426 - Multicore Programming\bPhysicsEngine2D\bPhysicsEngine2D\bPhysicsEngine2D\test.cu"" exited with code 255. C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V120\BuildCustomizations\CUDA 7.5.targets 604 9 bPhysicsEngine2D

Googling shows my problem seems to be different from all others. If I comment out that line it compiles fine. If I rewrite it to not bother with .h/.cpp and just use .cuh/.cu then it also compiles and works.

But I would very much like to use regular C++ h/cpp files for time saving reasons.

Additionally, I found that the error goes away if boo() is defined inline in my .h file as

CUDA_CALLABLE_MEMBER void boo() { };

Then it compiles.

But when I define it ‘properly’ in test.cpp:

CUDA_CALLABLE_MEMBER void helloWorld::boo()
{

}

I get the same error again.

Any help would be appreciated. :)

Robert_Crovella · November 16, 2015, 1:11pm

A .cpp file is by default delivered directly to the host compiler.

The host compiler doesn’t understand the __host__ __device__ syntax.

If you have a properly configured CUDA project in MS VS, you can also use the -x cu switch to cause your .cpp file to be treated as a .cu file.

It’s not clear what you want or what your objection is to naming the file with a .cu extension.

Likewise, if you annotate a .h file with __host__ __device__, and then include that in a .cpp file, it won’t compile.

If you use a method like this:

http://stackoverflow.com/questions/32014839/how-to-use-a-cuda-class-header-file-in-both-cpp-and-cuda-modules

then you can put __host__ __device__ in a header file, and have it be usable in “ordinary” .cpp files (with no CUDA content) as well as .cu files.

RaenirSalazar · November 16, 2015, 2:36pm

I’m a little confused. The link you gave me uses the EXACT method I am using, and that method gives me this error. You did see that, right?

My objection to using the .cu extension is that I don’t want to permanently turn my project into a CUDA project (which doesn’t work with VS2015). I only want to do so temporarily, and to ignore CUDA when I don’t want to use it.

I’ll try out -x cu when I’m home though. :)

Robert_Crovella · November 16, 2015, 11:03pm

Yes, sorry, my previous comment was off-base.

Your error output from VS is less than helpful. VS errors consist of 2 parts in the CUDA toolchain: an actual error output from the cuda tool (nvcc in this case) as well as a follow-up error message from Visual Studio executive stating that the subtool exited with an error. You have provided the 2nd part, but not the first part. If the first part is not showing up in your VS console output, please modify the VS settings to provide more verbose output.

This line of code is a function call:

hw->boo();

Since this function call originates from device code, you have 2 options:

A. Use ordinary compilation (which is how your project is set up, as indicated by the --compile switch), and provide all necessary device code in a single compilation unit. You are currently compiling this way, but since the actual boo function is in another compilation unit, it won’t work. This is the proximate cause of the error (which would be more evident with the verbose output).

B. Use relocatable device code compilation with device linking, in which case device code in one compilation unit can call, and be linked against, device code in another compilation unit. This matches the way your code is currently structured, but your project is not set up to perform this type of compilation.

To enable the correct mode for compilation (called CUDA separate compilation) look for the “Generate Relocatable Device Code” option in VS project CUDA properties. If you’re still having trouble, please enable the verbose output from VS.
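For comparison, option A can be sketched roughly as follows: the body of boo lives in the header, so every file that includes it sees the full definition and no device linking is needed. This is only a minimal sketch. The int parameter, the x + 1 body, the kernel name callBooKernel and the helper callFromHost are hypothetical and not taken from the project above.

```cpp
// Sketch of a header whose methods are defined inline, so the same file
// can be included from a .cu file (compiled by nvcc) or a plain .cpp file.
#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif

class helloWorld
{
public:
    CUDA_CALLABLE_MEMBER helloWorld() {}

    // Defined inside the class, hence implicitly inline: the body is
    // available in whichever compilation unit makes the call.
    // The parameter and return value are hypothetical, for illustration.
    CUDA_CALLABLE_MEMBER int boo(int x) const { return x + 1; }
};

#ifdef __CUDACC__
// Only compiled when nvcc treats the including file as CUDA source.
// hw and out are assumed to point to device-accessible memory.
__global__ void callBooKernel(const helloWorld* hw, int* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
    {
        out[i] = hw->boo(i);
    }
}
#endif

// Host-side use of the same method; works with or without CUDA.
int callFromHost()
{
    helloWorld hw;
    return hw.boo(41);
}
```

The trade-off is that all device-callable bodies must sit in headers, which is why option B is usually the better fit when the implementations are meant to stay in separate source files.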

RaenirSalazar · November 16, 2015, 11:22pm

The only new error is:

But that’s from the relocatable code I imagine. Turning on verbosity to both Detailed and Diagnostic has not shown any additional error messages.

Robert_Crovella · November 17, 2015, 3:04am

That is the error I would expect. You can make this error go away by compiling a relocatable code project instead of your current project setup.

RaenirSalazar · November 17, 2015, 2:00pm

How would I go about setting this up? This is the first I’m hearing of it.

Robert_Crovella · November 17, 2015, 2:33pm

Please re-read my comment #4 in this thread. I give a basic description of how to modify a VS project to set this up. For a more complete description, I would search for a Visual Studio sample cuda project that uses relocatable device code with separate compilation and linking, such as the simpleSeparateCompilation sample project, and refer to that for typical project settings:

http://docs.nvidia.com/cuda/cuda-samples/index.html#simple-static-gpu-device-library

You can also read the nvcc manual section that pertains to separate compilation and linking:

NVCC :: CUDA Toolkit Documentation

And there are plenty of questions and answers about it on various web forums if you want to search for those.

RaenirSalazar · November 17, 2015, 3:28pm

You mean this?

I clearly already said I tried this.

The example in the second link doesn’t seem to match how my code is organized. It is also not clear how, or with which commands, I am supposed to achieve the result I want.

Suppose I have a regular C++ header file with my method declarations and the implementations in my .cpp file. I want a separate .cu file and kernel to be able to call methods on that object.

What you linked seems to have the declarations in a .h file, but the implementations nevertheless still in the .cu file. I already know that works, but I dislike having to do it that way.

Robert_Crovella · November 18, 2015, 4:08am

Here’s what worked for me in VS2013, CUDA 7.0, win7 x64.

1. Create a new CUDA project.
2. Project…Properties…CUDA C/C++… modify the project type to x64 and set Generate Relocatable Device Code to Yes.
3. Set the active project to x64 Release.
4. Add your files to the project. Since the project creation in step 1 creates a default kernel.cu file, I just replaced the default code in kernel.cu with the code from your test.cu and then added your other 3 files. (Also un-comment the troublesome line of code you have commented out.)
5. If you use the ordinary method to add test.cpp to the project, then after adding it you will need to change its type from an ordinary C/C++ file to a CUDA C/C++ module. Right click on the file in the solution explorer window, go to Configuration Properties…General and change the Item Type from C++ to CUDA C/C++. Then go to the CUDA C/C++ properties on this file and add -x cu to the Command Line…Additional Options field.
6. For the file properties for both kernel.cu (your test.cu file) and test.cpp, right click on the file in the solution explorer window, and change the CUDA C/C++…Common…Generate Relocatable Device Code entry to Yes.

That should be everything needed setup-wise. However the files you have shown are not a complete project, so I chose to add the following line:

int main() {}

at the end of your test.cpp file. With that I was able to successfully compile the project with no errors. Here is the console output from the compile command:

1>------ Rebuild All started: Project: test10, Configuration: Release x64 ------
1>  
1>  c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include"     --keep-dir x64\Release -maxrregcount=0  --machine 64 --compile      -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi  /MD  " -o x64\Release\kernel.cu.obj "c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10\kernel.cu" -clean 
1>  kernel.cu
1>  
1>  c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64"  -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include"     --keep-dir x64\Release -maxrregcount=0  --machine 64 --compile  -x cu     -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi  /MD  " -o x64\Release\test.cpp.obj "c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10\test.cpp" -clean 
1>  test.cpp
1>  Compiling CUDA source file kernel.cu...
1>  
1>  c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -rdc=true -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include"     --keep-dir x64\Release -maxrregcount=0  --machine 64 --compile -cudart static     -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi  /MD  " -o x64\Release\kernel.cu.obj "c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10\kernel.cu" 
1>  kernel.cu
1>  Compiling CUDA source file test.cpp...
1>  
1>  c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" -rdc=true -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\include"     --keep-dir x64\Release -maxrregcount=0  --machine 64 --compile -cudart static -x cu     -DWIN32 -DWIN64 -DNDEBUG -D_CONSOLE -D_MBCS -Xcompiler "/EHsc /W3 /nologo /O2 /Zi  /MD  " -o x64\Release\test.cpp.obj "c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10\test.cpp" 
1>  test.cpp
1>  
1>  c:\Users\robertc\documents\visual studio 2013\Projects\test10\test10>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\nvcc.exe" -dlink -o x64\Release\test10.device-link.obj -Xcompiler "/EHsc /W3 /nologo /O2 /Zi  /MD  " -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\lib\x64" cudart.lib kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib  -gencode=arch=compute_20,code=sm_20  --machine 64 x64\Release\kernel.cu.obj x64\Release\test.cpp.obj 
1>  cudart.lib
1>  kernel32.lib
1>  user32.lib
1>  gdi32.lib
1>  winspool.lib
1>  comdlg32.lib
1>  advapi32.lib
1>  shell32.lib
1>  ole32.lib
1>  oleaut32.lib
1>  uuid.lib
1>  odbc32.lib
1>  odbccp32.lib
1>  kernel.cu.obj
1>  test.cpp.obj
1>  LINK : /LTCG specified but no code generation required; remove /LTCG from the link command line to improve linker performance
1>  test10.vcxproj -> c:\Users\robertc\documents\visual studio 2013\Projects\test10\x64\Release\test10.exe
1>  copy "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\cudart*.dll" "c:\Users\robertc\documents\visual studio 2013\Projects\test10\x64\Release\"
1>  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\cudart32_70.dll
1>  C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\cudart64_70.dll
1>          2 file(s) copied.
========== Rebuild All: 1 succeeded, 0 failed, 0 skipped ==========

Just to re-emphasize a point that I made earlier:

You are calling a device function from CUDA device code (in test.cu), but that function is defined in a separate compilation unit (test.cpp). The only way this will work is if this compilation unit (test.cpp) is handled properly by the nvcc compiler. The default behavior for nvcc is to pass a file with a .cpp extension directly to the host compiler. The host compiler will not generate the proper device code (and your CUDA_CALLABLE_MEMBER macro will expand to nothing). There are at least 2 ways to address this. One is to change the file extension to .cu, but you seemed to be opposed to this. The other is to override the nvcc default behavior by adding -x cu to the command line. That is the option I’ve demonstrated here.
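As a rough illustration of the macro point, the sketch below turns whatever CUDA_CALLABLE_MEMBER expands to into a string, so you can check what a given compiler actually sees. The helper names STRINGIFY, STRINGIFY_IMPL, cudaCallableExpansion and qualifiersWereDropped are hypothetical and exist only for this example.

```cpp
#ifdef __CUDACC__
#define CUDA_CALLABLE_MEMBER __host__ __device__
#else
#define CUDA_CALLABLE_MEMBER
#endif

// Two-level stringification: the outer macro expands its argument first,
// the inner one then quotes the result. Variadic so that an empty
// expansion is accepted.
#define STRINGIFY_IMPL(...) #__VA_ARGS__
#define STRINGIFY(...) STRINGIFY_IMPL(__VA_ARGS__)

// Returns the text CUDA_CALLABLE_MEMBER expanded to in this compilation.
const char* cudaCallableExpansion()
{
    return STRINGIFY(CUDA_CALLABLE_MEMBER);
}

// True when the macro expanded to nothing, i.e. the file went straight
// to the host compiler and no __host__ __device__ qualifiers were applied.
bool qualifiersWereDropped()
{
    return cudaCallableExpansion()[0] == '\0';
}
```

When a file is handed directly to the host compiler, qualifiersWereDropped() returns true. When the same file is compiled by nvcc as CUDA source (for example with -x cu), the expansion should be non-empty, although the exact text depends on how the toolkit headers define __host__ and __device__.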
