The CUDA v4.0 compiler on Windows reports a syntax error at the kernel launch in the code below. The real problem, however, is an undeclared identifier (mem_size_A) used inside a class template. It would be better if the CUDA compiler reported the undeclared identifier itself rather than a cascaded error at an unrelated line.
Unfortunately, the MSVC compiler does not complain about it either: the file compiles successfully once the CUDA syntax is removed (i.e., <<<>>> and __global__ deleted). MSVC only complains if bar() is actually used. It took me a while to narrow it down.
Does anyone know if there is a compiler switch (nvcc or MSVC) that does a more complete error check?
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\bin\nvcc.exe" -gencode=arch=compute_10,code="sm_10,compute_10" --use-local-env --cl-version 2010 -ccbin "c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v4.0\include" -G0 --keep-dir "Debug" -maxrregcount=0 --machine 32 --compile -D_NEXUS_DEBUG -g -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Debug\main.cu.obj" "…\main.cu"
visual studio 2010/Projects/…/main.cu(19): error C2059: syntax error : '<'
#include <stdio.h>
#include <stdlib.h>
#include <tchar.h>
template <class T>
class Matrix;
template <class T>
__global__ void kern()
{
};
template <class T>
class Matrix
{
public:
    static bool mul()
    {
        kern<T><<< 1, 1 >>>();
        return 0;
    };
    static void foo(int xxx)
    {
    }
    static void bar()
    {
        foo(mem_size_A);
    }
};
int _tmain(int argc, _TCHAR* argv[])
{
    #define BASETYPE float
    Matrix<BASETYPE>::mul();
    return 0;
}
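One possible workaround, offered as a sketch rather than a confirmed fix: since MSVC only checks the body of bar() when bar() is actually used, an explicit instantiation of the class template should force every member to be instantiated, whether or not anything calls it. The host-only example below is a hypothetical stand-in for the Matrix class above. It has no kernel, foo() is changed to return its argument so the result can be checked, and mem_size_A is declared locally so that the example compiles. I have not verified how nvcc v4.0 reports the error with this in place.

```cpp
#include <cstddef>

// Hypothetical, host-only stand-in for the Matrix class in the post.
template <class T>
class Matrix
{
public:
    // Changed from the post's void foo(int) so the value can be observed.
    static std::size_t foo(std::size_t bytes)
    {
        return bytes;
    }

    // Nothing in this file calls bar(), so implicit instantiation alone
    // would never generate (or fully check) this body.
    static std::size_t bar()
    {
        // Declared here so the sketch compiles. Deleting this line
        // recreates the undeclared-identifier situation from the post.
        const std::size_t mem_size_A = 16 * sizeof(T);
        return foo(mem_size_A);
    }
};

// Explicit instantiation definition: instantiates all members of
// Matrix<float>, including the otherwise unused bar().
template class Matrix<float>;
```

With the explicit instantiation in place, removing the local declaration of mem_size_A should produce an undeclared-identifier error even though bar() is never called. One such line is needed per element type that the code uses.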