Unresolved external symbol blockDim when linking a static NVCC compiled library

Hello,

I’m currently trying to compile a CUDA static library with NVCC, and link it to a google test executable.
I’m using Microsoft Visual Studio.

The static CUDA library compiles with no problem. However, when I try to link it to my google test application I get the following linker error:

dic_cuda.lib(dic_cuda_generated_interpolator_cuda.cu.obj) : error LNK2001: unresolved external symbol __device_builtin_variable_blockDim

I tried including “device_launch_parameters.h” in both the test file and the CUDA file, but that’s not the problem. I also tried linking cudart.lib and cuda.lib to my static library, but it doesn’t help either (I think the nvcc compiler uses them by default).

Any idea why?

Thank you

EDIT:

Instead of creating a static library, I tried to compile a DLL. The error now occurs when linking my library, not my googletest executable.

Here’s the build log:

2>------ Rebuild All started: Project: dic_cuda, Configuration: Debug x64 ------
2>Building Custom Rule C:/Users/mrc/source/repos/repro_core/EPIC_dic/sources/CMakeLists.txt
2>Compiling CUDA source file ..\..\..\EPIC_dic\sources\interpolator_cuda.cu...
2>
2>C:\Users\mrc\source\repos\repro_core\build\EPIC_dic\sources>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc.exe" -gencode=arch=compute_30,code=\"sm_30,compute_30\" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.23.28105\bin\HostX64\x64" -x cu -rdc=true -IC:\Users\mrc\source\repos\repro_core\EPIC_dic\include\dic -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\include"     --keep-dir x64\Debug -maxrregcount=0  --machine 64 --compile -cudart shared -Xcompiler="/EHsc -Zi -Ob0" -g   -D_WINDOWS -D"CMAKE_INTDIR=\"Debug\"" -Ddic_cuda_EXPORTS -D"CMAKE_INTDIR=\"Debug\"" -Ddic_cuda_EXPORTS -D_WINDLL -D_MBCS -Xcompiler "/EHsc /W1 /nologo /Od /Fddic_cuda.dir\Debug\vc142.pdb /FS /Zi /RTC1 /MDd /GR" -o dic_cuda.dir\Debug\interpolator_cuda.obj "C:\Users\mrc\source\repos\repro_core\EPIC_dic\sources\interpolator_cuda.cu"
2>interpolator_cuda.cu
2>
2>C:\Users\mrc\source\repos\repro_core\build\EPIC_dic\sources>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\bin\nvcc.exe" -dlink -o dic_cuda.dir\Debug\dic_cuda.device-link.obj -Xcompiler "/EHsc /W1 /nologo /Od /Zi /RTC1 /MDd /GR" -L"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2\lib\x64" cuda.lib cudadevrt.lib cudart.lib kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib -Wno-deprecated-gpu-targets -gencode=arch=compute_30,code=sm_30  --machine 64 dic_cuda.dir\Debug\interpolator_cuda.obj
2>cuda.lib
2>cudadevrt.lib
2>cudart.lib
2>kernel32.lib
2>user32.lib
2>gdi32.lib
2>winspool.lib
2>shell32.lib
2>ole32.lib
2>oleaut32.lib
2>uuid.lib
2>comdlg32.lib
2>advapi32.lib
2>interpolator_cuda.obj
2>   Creating library C:/Users/mrc/source/repos/repro_core/build/EPIC_dic/sources/Debug/dic_cuda.lib and object C:/Users/mrc/source/repos/repro_core/build/EPIC_dic/sources/Debug/dic_cuda.exp
2>interpolator_cuda.obj : error LNK2001: unresolved external symbol __device_builtin_variable_blockDim
2>C:\Users\mrc\source\repos\repro_core\build\EPIC_dic\sources\Debug\dic_cuda.dll : fatal error LNK1120: 1 unresolved externals
2>Done building project "dic_cuda.vcxproj" -- FAILED.
========== Rebuild All: 1 succeeded, 1 failed, 0 skipped ==========

I eventually found the problem…

It was a typo in this host-side wrapper around my __global__ kernel:

void INTERP_CUDA::init_eye_matrix_wrapper(double* ptr, int cols, int rows)
{
    if (cols != rows) {
        throw "Matrix must have square dimensions";
    }

    cudaDeviceProp properties;
    cudaGetDeviceProperties(&properties, 0);

    int maxThreads = properties.maxThreadsPerBlock;
    int maxBlockCols = properties.maxThreadsDim[0];
    int maxBlockRows = properties.maxThreadsDim[1];

    if (cols > maxBlockCols || rows > maxBlockRows || rows*cols > maxThreads) {
        throw "Matrix is too large for this CUDA kernel.";
    }

    dim3 glockDim(cols, rows);  // <= here: glockDim instead of blockDim
    dim3 gridDim(1, 1);

    init_eye_matrix<<<gridDim, blockDim>>>(ptr, cols, rows);
}

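The error reported by the linker led me the wrong way. Because the local variable was misspelled as glockDim, the blockDim in the kernel launch referred to CUDA’s built-in blockDim variable. That variable is only declared for host code and never defined, so the file compiled but failed at link time with the unresolved external symbol. IntelliSense doesn’t understand the <<< >>> launch syntax, so it couldn’t show me where the error was.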
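
For anyone hitting the same error, here is a minimal sketch of a corrected wrapper. It is not the original class code: the kernel body is an assumption (the post does not show init_eye_matrix), the wrapper is written as a free function instead of an INTERP_CUDA member, and the exception types and the cudaGetLastError check are additions. The local launch dimensions are deliberately named block and grid so that they cannot be confused with the built-in blockDim and gridDim.

```cuda
#include <cuda_runtime.h>
#include <stdexcept>

// Assumed kernel body: the original post does not show init_eye_matrix.
// One thread per element, a single block, row-major storage.
__global__ void init_eye_matrix(double* ptr, int cols, int rows)
{
    int col = threadIdx.x;
    int row = threadIdx.y;
    if (col < cols && row < rows) {
        ptr[row * cols + col] = (row == col) ? 1.0 : 0.0;
    }
}

// Hypothetical free-function version of the wrapper.
// ptr must point to device memory holding rows * cols doubles.
void init_eye_matrix_wrapper(double* ptr, int cols, int rows)
{
    if (cols != rows) {
        throw std::invalid_argument("Matrix must have square dimensions");
    }

    cudaDeviceProp properties;
    if (cudaGetDeviceProperties(&properties, 0) != cudaSuccess) {
        throw std::runtime_error("cudaGetDeviceProperties failed");
    }

    if (cols > properties.maxThreadsDim[0] ||
        rows > properties.maxThreadsDim[1] ||
        rows * cols > properties.maxThreadsPerBlock) {
        throw std::length_error("Matrix is too large for this CUDA kernel.");
    }

    // Names that differ from the built-ins blockDim and gridDim.
    dim3 block(cols, rows);
    dim3 grid(1, 1);

    init_eye_matrix<<<grid, block>>>(ptr, cols, rows);

    // Report launch errors (for example an invalid configuration).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        throw std::runtime_error(cudaGetErrorString(err));
    }
}
```

With names like block and grid, a misspelling such as blcok is rejected by the compiler as an undeclared identifier, with a file and line number, instead of silently binding to a built-in and surfacing later as LNK2001.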