How to create a static lib using CUDA 5.0-6.5 and VS2010 (problem solved and bug found)

Hi, does anyone know how to create a static lib file for device functions? I tried this in VS2010: the *.cu file compiles and a *.lib file is created, but when I use the *.lib in a new project, VS2010 reports that it cannot find the device function in the lib file. This happens even though I entered the lib file name under project properties -> CUDA linker -> general -> additional dependencies and the file path under project properties -> CUDA linker -> general -> additional library directories. Does CUDA 5.0 allow this? Thank you.

If CUDA 5.0 cannot make a user lib in VS2010, does anyone know a way to make a user lib with nvcc? The lib file should be usable in a new cu project without having to recompile the original code. Does anyone have experience with this? Thanks in advance.

Hi, can anybody help me with the above problems? Thanks.

Have you had a chance to look at the following example?

http://docs.nvidia.com/cuda/cuda-samples/index.html#simple-static-gpu-device-library

Thank you, njuffa. Yes, I read that webpage before, but it is not what I want. It shows how to use the new separate compilation feature of CUDA 5.0, but it does not actually create a static GPU device library. In C/C++, a static library is a *.lib file that can be linked directly with other compiled files in new projects without recompiling the library or the original *.c or *.cpp files behind it. The example on the webpage only separates device functions into different cu files, and all cu files have to be recompiled every time the project is built. I know that nvcc can compile *.cu files into a *.lib file, but I cannot use the resulting *.lib file, because nvcc cannot find the functions in it (of course I have also included a *.h file that gives the compiler the prototypes of the functions in the *.lib file). If anyone has experience with static cu libs, please help. Thanks a lot.

Sorry, didn’t mean to be misleading, I was simply going by example name. I haven’t gotten around yet to experimenting with the building of static device libraries.

Yes, the example name is really confusing. I thank you for your kind help. Let us hope that we can hear from some experts on this issue soon.

Does anybody use nvcc to create static *.lib files? Please help.

Here is a simple example of a static device library. I successfully got this to work on Linux, will try Windows next. I created four files.

funcs.h:

#if !defined (FUNCS_H__)
#define FUNCS_H__
__device__ int add1 (int x);
__device__ int add2 (int x);
#endif

func1.cu:

#include "funcs.h"
__device__ int add1 (int x)
{
    return x + 1;
}

func2.cu:

#include "funcs.h"
__device__ int add2 (int x)
{
    return add1 (add1 (x));
}

mainprog.cu:

#include <stdio.h>
#include <stdlib.h>
#include "funcs.h"
#define NUM_RESULTS (2)
__global__ void kernel (int x, int *res)
{
    res[0] = add1 (x);
    res[1] = add2 (x);
}
int main (void)
{
    int x = 5;
    int res[NUM_RESULTS];
    int *res_d;
    cudaMalloc ((void **)&res_d, sizeof(res_d[0])*NUM_RESULTS);
    kernel<<<1,1>>>(x, res_d);
    cudaMemcpy (&res, res_d, sizeof(res), cudaMemcpyDeviceToHost);
    cudaFree (res_d);
    printf ("results = %d %d\n", res[0], res[1]);
    return EXIT_SUCCESS;
}

I built a static library of device functions, and linked it to the main program:

~/tmp $ nvcc -arch=sm_35 -c -rdc=true -o func1.o func1.cu
~/tmp $ nvcc -arch=sm_35 -c -rdc=true -o func2.o func2.cu
~/tmp $ ar r libfunc.a func1.o func2.o
~/tmp $ ranlib libfunc.a
~/tmp $ nvcc -arch=sm_35 -rdc=true -o mainprog mainprog.cu -L . -l func
~/tmp $ ./mainprog 
results = 6 7

Now, I changed func2.cu as follows:

#include "funcs.h"
__device__ int add2 (int x)
{
    return add1 (add1 (add1 (x)));
}

Then refreshed the static library, and re-linked the main program:

~/tmp $ nvcc -arch=sm_35 -c -rdc=true -o func2.o func2.cu
~/tmp $ ar r libfunc.a func2.o
~/tmp $ ranlib libfunc.a
~/tmp $ nvcc -arch=sm_35 -rdc=true -o mainprog mainprog.cu -L . -l func
~/tmp $ ./mainprog 
results = 6 8

Here is the equivalent process on Windows, using the same source files as shown above. For the original build of the library and linking to the main app:

U:\tmp>nvcc -arch=sm_20 -rdc=true -o func1.obj -c func1.cu

U:\tmp>nvcc -arch=sm_20 -rdc=true -o func2.obj -c func2.cu

U:\tmp>lib /nologo /out:func.lib func1.obj func2.obj

U:\tmp>nvcc -arch=sm_20 -rdc=true -o mainprog mainprog.cu -L . -l func

U:\tmp>mainprog
results = 6 7

Now, I changed func2.cu, rebuilt it and refreshed the library, then linked to main program:

U:\tmp>nvcc -arch=sm_20 -rdc=true -o func2.obj -c func2.cu

U:\tmp>lib /nologo /out:func.lib func.lib func2.obj
Replacing func2.obj

U:\tmp>nvcc -arch=sm_20 -rdc=true -o mainprog mainprog.cu -L . -l func

U:\tmp>mainprog
results = 6 8

Thank you, njuffa. This solved my problem. It turns out that the *.lib cannot currently be created in VS2010; it must be created with nvcc as you showed.

However, when I used the *.lib files made by nvcc in VS2010, some problems appeared at the link stage. In the end it seems that a *.lib file can contain either only host functions or only device functions, but not both. The lib files for the host functions need to be listed in the C/C++ linker options, while those for the device functions need to be listed in the CUDA linker options. With this setup my program built correctly.

Thank you. I appreciate your help.

njuffa’s approach works around the apparent inability of Visual Studio 2010 to create CUDA static libraries from its IDE, and it works perfectly fine.

Concerning this comment, as a further piece of information: I have successfully used njuffa’s procedure with a func1.cu that includes a function with the __host__ __device__ qualifiers, a function with the __global__ qualifier, and a function without qualifiers.
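As a rough illustration of the kind of source file described in the post above, here is a minimal sketch that mixes all three kinds of functions in one file. The file name mixed.cu and the names add1_kernel and add1_on_gpu are made up for this example and do not come from the thread. The main function is only there so the sketch can be built on its own; in a library it would live in the application instead.

```cuda
// mixed.cu: hypothetical example, not taken from the thread.
// Assumption: built with the same flags used earlier in the thread,
// e.g. nvcc -arch=sm_35 -rdc=true -o mixed mixed.cu
#include <stdio.h>
#include <stdlib.h>

// Callable from both host code and device code.
__host__ __device__ int add1 (int x)
{
    return x + 1;
}

// Kernel: runs on the device, launched from the host.
__global__ void add1_kernel (int x, int *res)
{
    res[0] = add1 (x);
}

// Plain host function (no qualifier) that wraps the kernel launch.
int add1_on_gpu (int x)
{
    int res = 0;
    int *res_d = NULL;
    cudaMalloc ((void **)&res_d, sizeof(res_d[0]));
    add1_kernel<<<1,1>>>(x, res_d);
    cudaMemcpy (&res, res_d, sizeof(res), cudaMemcpyDeviceToHost);
    cudaFree (res_d);
    return res;
}

int main (void)
{
    // Calls the host version of add1 directly and the device version
    // through the wrapper.
    printf ("host: %d, device: %d\n", add1 (5), add1_on_gpu (5));
    return EXIT_SUCCESS;
}
```

To use it as a library source, one would presumably drop main, compile with -c -rdc=true, and archive the object with ar or lib as in njuffa’s steps.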

Hi,
I am stuck with the same problem.
I am generating a static library with a few classes whose methods call kernels included in the static library. I could compile the library successfully using the -dc option and then linking all the objects with nvcc.
The problem occurs when I link a C++ application with this library: a link error says that the classes exported by the static library cannot be found.
Example:

  1. Compile .cu files of the library:
    nvcc -arch=sm_30 …/reconGPU/src/CuMlemSinogram3d.cu …/reconGPU/src/CuSiddonProjector.cu …/reconGPU/src/CuProjector.cpp -dc -I…/recon/inc/ -I…/data/inc -I…/reconGPU/inc/ -I…/cmdGPU/inc/ -I/usr/local/cuda/include/ -I…/utils/inc/

  2. Linking the .o to generate the static lib (libreconGpu.a):
    nvcc -arch=sm_30 …/reconGPU/src/CuMlemSinogram3d.cu …/reconGPU/src/CuSiddonProjector.cu …/reconGPU/src/CuProjector.cpp -dc -I…/recon/inc/ -I…/data/inc -I…/reconGPU/inc/ -I…/cmdGPU/inc/ -I/usr/local/cuda/include/ -I…/utils/inc/

  3. Compiling and linking the application with the library:
    nvcc -arch=sm_30 …/reconGPU/src/CuMlemSinogram3d.cu …/reconGPU/src/CuSiddonProjector.cu …/reconGPU/src/CuProjector.cpp -dc -I…/recon/inc/ -I…/data/inc -I…/reconGPU/inc/ -I…/cmdGPU/inc/ -I/usr/local/cuda/include/ -I…/utils/inc/

I get the following error:
./libreconGpu.a(CuMlemSinogram3d.o): In function `__sti____cudaRegisterAll_52_tmpxft_00002788_00000000_11_CuMlemSinogram3d_cpp1_ii_50ed0107': /tmp/tmpxft_00002788_00000000-3_CuMlemSinogram3d.cudafe1.stub.c:2: undefined reference to `__cudaRegisterLinkedBinary_52_tmpxft_00002788_00000000_11_CuMlemSinogram3d_cpp1_ii_50ed0107'
./libreconGpu.a(CuSiddonProjector.o): In function `__sti____cudaRegisterAll_53_tmpxft_00002788_00000000_19_CuSiddonProjector_cpp1_ii_760d61c2': /tmp/tmpxft_00002788_00000000-8_CuSiddonProjector.cudafe1.stub.c:2: undefined reference to `__cudaRegisterLinkedBinary_53_tmpxft_00002788_00000000_19_CuSiddonProjector_cpp1_ii_760d61c2'
collect2: ld returned 1 exit status

If I compile with the following modifications:
In 2) add -dlink flag and generate libreconGpu.o
In 3) Compile .o of the application:
nvcc -arch=sm_30 CuProjector.o CuMlemSinogram3d.o CuSiddonProjector.o -o libreconGpu.o -L./data -L./recon -ldata -lrecon -lib -dlink
4) Finally link everything using every .o:
nvcc *.o -o cuMLEM -L. -L./data -L./recon -L/usr/local/cuda/lib64 -I…/recon/inc/ -I…/data/inc -I…/reconGPU/inc/ -I…/cmdGPU/inc/ -I/usr/local/cuda/include/ -I…/utils/inc/ -lrecon -ldata -lreconGpu -lcudart

That is working. But I want to avoid recompiling with the .o files of the static library libreconGpu. It should be enough to link to the static library.

Any idea how I can solve this?
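For the Linux case described above, the nvcc documentation on separate compilation describes a device link step (-dlink) whose output object can be linked by the host linker together with the -dc objects. A minimal sketch of how that might be packaged into a static library follows. The file scale.cu, the library name libscale.a, the function names, and the CUDA install path in the comments are all placeholders assumed for illustration, not taken from the project in the post.

```cuda
// scale.cu: hypothetical library source (not from the project above).
// Assumed build steps on Linux, with placeholder file and library names:
//
//   nvcc -arch=sm_30 -dc scale.cu -o scale.o           (relocatable device code)
//   nvcc -arch=sm_30 -dlink scale.o -o scale_dlink.o   (device link step)
//   ar rcs libscale.a scale.o scale_dlink.o            (archive both objects)
//   g++ main.cpp -o app -L. -lscale -L/usr/local/cuda/lib64 -lcudart
//
// If scale_dlink.o is left out of the archive, the host linker is expected
// to report an undefined reference to __cudaRegisterLinkedBinary_... .
#include <stdio.h>
#include <cuda_runtime.h>

// Device helper, only visible to device code.
__device__ float scale_one (float v, float factor)
{
    return v * factor;
}

// Kernel that scales each element in place.
__global__ void scale_kernel (float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] = scale_one (data[i], factor);
    }
}

// Host entry point exported by the library. A host-only main.cpp needs
// just this declaration: void scale_on_gpu (float *data, int n, float factor);
void scale_on_gpu (float *data, int n, float factor)
{
    float *data_d = NULL;
    cudaMalloc ((void **)&data_d, n * sizeof(data[0]));
    cudaMemcpy (data_d, data, n * sizeof(data[0]), cudaMemcpyHostToDevice);
    scale_kernel<<<(n + 255) / 256, 256>>>(data_d, n, factor);
    cudaMemcpy (data, data_d, n * sizeof(data[0]), cudaMemcpyDeviceToHost);
    cudaFree (data_d);
}

#ifdef SCALE_DEMO_MAIN
// Optional stand-alone driver: nvcc -DSCALE_DEMO_MAIN -rdc=true scale.cu
int main (void)
{
    float v[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    scale_on_gpu (v, 4, 2.0f);
    printf ("%g %g %g %g\n", v[0], v[1], v[2], v[3]);
    return 0;
}
#endif
```

With this layout the application would only need to link against the archive and the CUDA runtime. Whether it applies unchanged to the class-based library in the post is an assumption.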

This bug has been fixed in our next release.
There is a workaround for CUDA 6.0 and CUDA 6.5:

  1. Disable -dlink for the static library.
  2. Add staticlib.lib as a dependency for the host linker in your application.

Note that there is a bug in CUDA 6.0.targets that prevents #1 from working. To fix it, go to line 730 in the ComputeCudaLinkOutput target and change it from this:


To this:


victor

I still can’t seem to get this to work, even with the above target modification. I have 3 VS projects:

device_functions - functions declared as __device__
my_kernel - uses functions from device_functions to make a kernel
main - main console app.

device_functions builds fine, but I get a link error when my_kernel builds - the linker can’t find the functions in device_functions.

I tried making device_functions an explicit dependency of my_kernel in the “CUDA Linker” property page, but that didn’t help either.

Any ideas?

Eventually I figured out the solution for this issue. I used VS2010 and CUDA 6.5. To build a CUDA *.lib in the VS2010 IDE, one must disable the “CUDA linker”. This is how I did it: Project properties->CUDA linker->Linker Output: empty it, i.e. remove the default setting “$(IntDir)$(TargetName).device-link.obj”. With no output configured, the CUDA linker produces nothing and so cannot mess up the inputs to the VS2010 librarian. The bug in the CUDA integration with VS2010 is that the “CUDA linker” creates “$(IntDir)$(TargetName).device-link.obj” and passes it as an input to the VS2010 librarian, which does not need this obj file; it only needs the *.obj files created by the -dc compilations of the source files.

I hope this helps everyone who is still struggling with this problem, and helps the CUDA product team fix the bug in a future CUDA version.