Linking errors with CUDA 4.0 but works fine with CUDA 2.2 Problem related to dynamic runtime librari

Hi everyone,

I’m trying to build a Visual Studio 2008 solution (Windows 7 Pro, 64-bit) that has almost 300 projects. We’re using dynamic libraries and I’m wondering whether the problem is related to the /MD flag. I’ve seen similar posts here, for instance with Thrust (http://forums.nvidia…howtopic=195470). That problem is solved now (not sure whether the cause was nvcc or the SDK Marching Cubes example itself), and I’ve seen quite a few problems elsewhere with /MD compilation. But it may be another issue. I should also say that the whole solution compiles with CUDA 2.2 (the version used previously). I just joined the team and I’d like to update to CUDA 4.0 for obvious reasons.


Here is the type of error I get:

156>solver_cudad.lib(solver_cuda_generated_LibCuda.cu.obj) : error LNK2005: "void __cdecl function1(float *,float *)" (?function1@@YAXPEAM0@Z) already defined in LibThing_test.obj

155>Embedding manifest...

156>solver_cudad.lib(solver_cuda_generated_LibCuda.cu.obj) : error LNK2005: "void __cdecl function2(float *,float *,float *)" (?function2@@YAXPEAM00@Z) already defined in LibThing_test.obj

111>QtHoverPoints.cpp

156>solver_cudad.lib(solver_cuda_generated_LibCuda.cu.obj) : error LNK2005: "void __cdecl function3(float *,float &)" (?function3@@YAXPEAMAEAM@Z) already defined in LibThing_test.obj

156>solver_cudad.lib(solver_cuda_generated_LibCuda.cu.obj) : error LNK2005: "void __cdecl function4(float *,float,float *)" (?function4@@YAXPEAMM0@Z) already defined in LibThing_test.obj

156>solver_cudad.lib(solver_cuda_generated_LibCuda.cu.obj) : error LNK2005: "void __cdecl function5(float *,float &)" (?function5@@YAXPEAMAEAM@Z) already defined in LibThing_test.obj

156>...\build\Build_win64\bin\Debug\test_solver_lib.exe : fatal error LNK1169: one or more multiply defined symbols found


I’m not familiar with the Windows development platform (I usually develop on Mac or Linux), so I might be missing something and the problem could be quite simple to solve. Since everything works fine with CUDA 2.2, I tend to think that our project’s configuration is not at fault. Are you aware of any change in nvcc between 2.2 and 4.0 related to the compilation rules? For the sake of completeness, I generate the makefiles using CMake 2.8.4. I mention this because I believe it means I’m using CMake’s rules for compiling .cu files rather than the rules NVIDIA ships with its toolkit, but maybe someone can confirm this.


Maybe I should add a few pieces of information: LibThing_test.cpp includes “LibA.h”, and LibA.h includes “LibB.h”. Solver_cuda.cu also includes “LibB.h”. function1 to function5 are all defined in LibB.h. However, if the organisation of my files were bad, why would it compile with CUDA 2.2?


Any help would be very much appreciated.

Well, the support for C++ got a lot better with 2.3, and then additional C++ features have been added since then for sm_20 devices (virtual functions and such).

As for the CUDA build rules in CMake: CMake passes the same host compiler flags to nvcc, via -Xcompiler, as it passes to gcc.

Thanks for your reply.

I initially thought I was doing something wrong with the #include directives. I have a .cu file that includes a given “header.h”, and I have .cpp files in other projects (in the same solution) that include the same header.h. So I could understand a conflict at link time, since my .cpp files are compiled with cl and the .cu file with nvcc: when the linker combines the objects, it sees two definitions, right? Or is the linker smarter than that? And in that case, why does this work with CUDA 2.2? Before changing the whole structure of my code, I’d like to make sure I understand everything.

Anyone?

I should add two things. First, it may very well be related to http://forums.nvidia.com/index.php?showtopic=205444. It seems that even the SDK examples do not support /MD (at least matrixMul).

And second, I realised that if I do the following:

  1. compile the whole solution without CUDA first (only a couple of the 300 projects, the ones using CUDA, are not compiled)
  2. turn CUDA back on in my cmake configuration
  3. compile again (no clean in between of course)

it works! No problem with linking this time.

Any idea about why I’m unable to compile the whole solution from scratch?

I found the cause of my problem and it’s not related to /MD vs /MT flags.

Here is what we do in our framework. In a header file functions.h, we have functions defined like this:

FUNCTIONTYPE void name_function(...);

... // other declarations

FUNCTIONTYPE void name_function(...)

{

	// implementation here

}

When our application is compiled without CUDA, FUNCTIONTYPE is defined as “inline”, and that’s it. When CUDA is enabled, both variants are built. First, FUNCTIONTYPE is defined as “__device__”, and functions.h is included in CudaLibrary.cu, which is compiled by NVCC to build a library. In another translation unit, we build a similar library without CUDA support: a library.cpp includes functions.h as well, but with FUNCTIONTYPE set to “inline” this time. We then face the LNK2005 link errors described in my first post.

Again, this setup works fine with CUDA 2.2 but fails with CUDA 3.2 and CUDA 4.0 (I couldn’t test the versions in between, so I don’t know when it starts failing). Also, if I compile the application without CUDA support first, then turn the USE_CUDA flag ON and compile again, the build succeeds.

I have now found a fix, and part of the explanation is here: http://stackoverflow.com/questions/4193866/linker-errors-2005-and-1169-multiply-defined-symbols-when-using-cuda-device. The answer is to define FUNCTIONTYPE as “__device__ inline”. Two questions remain:

  1. I don’t really understand why I need to add “inline”. I thought device functions were always inlined by NVCC.

  2. How come it works with CUDA 2.2? Has NVCC changed the way it handles inlined functions?

I’m glad I could fix my problem, and that’s why I’m posting it here: so that anyone who runs into the same problem can read it. But I would be even happier if I really understood the reason.
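For readers landing here with the same error, the fix described in this post can be sketched roughly as follows. This is only an illustration under assumptions: the thread does not say how FUNCTIONTYPE is selected, so the `__CUDACC__` check (a macro nvcc defines while compiling) stands in for the project's own USE_CUDA logic, and `scale_in_place` is a made-up function name.

```cpp
// functions.h (sketch): one header shared by .cu and .cpp translation units.
#ifndef FUNCTIONS_H
#define FUNCTIONS_H

#ifdef __CUDACC__
// Compiled by nvcc: a device function, additionally marked inline so the
// same definition may appear in several object files without clashing.
#define FUNCTIONTYPE __device__ inline
#else
// Compiled by the host compiler only (cl, gcc, ...).
#define FUNCTIONTYPE inline
#endif

// Declaration, as in the original header layout.
FUNCTIONTYPE void scale_in_place(float* values, float factor, int count);

// Definition in the same header (hypothetical example body).
FUNCTIONTYPE void scale_in_place(float* values, float factor, int count)
{
    for (int i = 0; i < count; ++i) {
        values[i] *= factor;
    }
}

#endif // FUNCTIONS_H
```

In a host-only build, a .cpp file can include the header and call `scale_in_place(buffer, 2.0f, n)` directly; in a .cu file the same call would be made from a kernel or another device function.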

It used to be the case that the compiler had to inline all functions. But ever since an ABI was added for compute capability 2.x and up (in CUDA 3.1, if I recall correctly), the compiler no longer has to inline functions, as it can also call them. This allows for recursive functions, for example. The compiler still inlines functions fairly aggressively. There are new function attributes that allow a programmer to control the inlining: __noinline__ and __forceinline__.
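A minimal sketch of how those two attributes might be used is shown below. `__noinline__` and `__forceinline__` are the CUDA qualifiers mentioned above; the wrapper macros `HD`, `NOINLINE` and `FORCEINLINE`, the host-side fallbacks, and both function names are assumptions added so the same file also builds with a plain host compiler.

```cpp
#ifdef __CUDACC__
// nvcc build: use the CUDA qualifiers directly.
#define HD          __host__ __device__
#define NOINLINE    __noinline__
#define FORCEINLINE __forceinline__
#else
// Host-only build (assumed fallback): the qualifiers reduce to plain C++.
#define HD
#define NOINLINE
#define FORCEINLINE inline
#endif

// Recursion needs a real function call, which on the device requires
// compute capability 2.x or higher; NOINLINE asks the compiler to keep
// this as a call rather than expanding it at the call site.
HD NOINLINE unsigned factorial(unsigned n)
{
    return n <= 1u ? 1u : n * factorial(n - 1u);
}

// A small helper that should always be expanded at the call site.
HD FORCEINLINE float axpy(float a, float x, float y)
{
    return a * x + y;
}
```

This is meant as a single-file sketch. If `factorial` were moved into a shared header without `inline`, it would run into the same multiple-definition problem discussed in this thread.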

On Windows I can only link with /MT or /MTd.

However, if I have separate projects within my solution, I can compile the MFC project with /MD and the CUDA project with /MT, which is the only way I can make it work.

Is it supposed to be as simple as changing the switch or do I have to do more to make it work with /MD?

Thanks for your reply, Njuffa! Now I know why it used to work with the old version of CUDA. Great!

Not sure why I need to define all functions as inline to make it work, though. Isn’t the linker smart enough to distinguish between host functions and device functions? Or does it just check the names and complain when it finds identical ones? My understanding is that by adding inline, the functions become defined with internal linkage and stay within the scope of each translation unit, so the linker has no knowledge of them and it works fine. But why do I need to inline them all?
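On the linkage question, a small sketch may help; the function names are hypothetical. Note that the decorated names in the error output (for example ?function1@@YAXPEAM0@Z) carry only the function name, calling convention and parameter types, so the linker has nothing to tell a host definition from one coming out of a .cu object. Strictly speaking, `inline` in C++ does not give a function internal linkage: it keeps external linkage but allows identical definitions in several translation units, which the linker merges. It is `static` that gives each translation unit a private copy.

```cpp
// Three ways a function body can live in a header (single-file sketch).

// 1. Plain definition: external linkage, and exactly one definition is
//    allowed in the whole program. Included from two translation units,
//    this is the pattern that produces LNK2005 / LNK1169. It is left
//    commented out on purpose.
// void add_plain(float* out, const float* in) { *out += *in; }

// 2. inline: still external linkage, but identical definitions may appear
//    in every translation unit that includes the header; the linker keeps
//    a single copy.
inline void add_inline(float* out, const float* in) { *out += *in; }

// 3. static: internal linkage; each translation unit gets its own private
//    copy, which the linker never compares against the others.
static void add_static(float* out, const float* in) { *out += *in; }
```

Either of the last two forms avoids the duplicate-symbol error for functions defined in a header; the fix reported in this thread uses `inline`.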

@akavo: I can compile everything with /MD; my initial guess that my problem was related to this flag was wrong. More precisely, I had added a new project from the CUDA SDK to my Visual Studio solution just to test the CMake configuration. This SDK example uses cutil, and I had compiled cutil with /MT. That is why I had /MD vs /MT errors in the beginning, and I confused them with the errors mentioned in my first post. The /MD flag is passed to NVCC (the default behaviour) and it works fine.