I am using OpenCV 3.1 with VS2012 C++/CLI on a Win10 machine with CUDA 7.5 and NSight 4.7.
I am hoping that I will not get flamed for this question but it seems like something that might have more general applicability.
My problem requires a custom CUDA kernel that interfaces with my base OpenCV code, and this is my first time writing one. I have been searching for a current example that shows all the include files and namespace declarations needed in the .cpp and .cu source files and their header files.
So far, I have the following for the .cu file:
#include <cuda.h>
#include <cuda_runtime.h>
#include <stdlib.h>
using namespace std;
//using namespace cv;
__global__ void initialize_labels(void)
{
    unsigned int gid = threadIdx.x + blockIdx.x * blockDim.x;
}
extern "C"
{
    void tinit(void){
        dim3 blDim(32,8);
        dim3 grDim(10,10);
        // std::cout << "calling kernel from func\n";
        initialize_labels<<<grDim,blDim>>>(void);
        // std::cout << "done with kernel call\n";
        cudaDeviceSynchronize();
    }
}
However, when I try to compile this, threadIdx, blockIdx, etc. appear to be unrecognized, as does the kernel call itself: the compiler balks at the "<<<" launch syntax.
Here is what the compiler prints out:
1>------ Rebuild All started: Project: SV11, Configuration: Debug x64 ------
1>
1> C:\Users\Administrator\Documents\Visual Studio 2012\Projects\SV11\SV11>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -G --keep-dir x64\Debug -maxrregcount=0 --machine 64 --compile -g -DWIN32 -D_DEBUG -D_UNICODE -DUNICODE -Xcompiler "/EHsc /W2 /nologo /Od /Zi /MDd " -o x64\Debug\gpuMaxEnclosedCircle.cu.obj "C:\Users\Administrator\Documents\Visual Studio 2012\Projects\SV11\SV11\gpuMaxEnclosedCircle.cu" -clean
1> gpuMaxEnclosedCircle.cu
1> Compiling CUDA source file gpuMaxEnclosedCircle.cu...
1>
1> C:\Users\Administrator\Documents\Visual Studio 2012\Projects\SV11\SV11>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2012 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -G --keep-dir x64\Debug -maxrregcount=0 --machine 64 --compile -cudart static -g -DWIN32 -D_DEBUG -D_UNICODE -DUNICODE -Xcompiler "/EHsc /W2 /nologo /Od /Zi /MDd " -o x64\Debug\gpuMaxEnclosedCircle.cu.obj "C:\Users\Administrator\Documents\Visual Studio 2012\Projects\SV11\SV11\gpuMaxEnclosedCircle.cu"
1>C:/Users/Administrator/Documents/Visual Studio 2012/Projects/SV11/SV11/gpuMaxEnclosedCircle.cu(15): warning : variable "gid" was declared but never referenced
1>C:/Users/Administrator/Documents/Visual Studio 2012/Projects/SV11/SV11/gpuMaxEnclosedCircle.cu(24): error : type name is not allowed
1>C:/Users/Administrator/Documents/Visual Studio 2012/Projects/SV11/SV11/gpuMaxEnclosedCircle.cu(24): error : too many arguments in function call
1> gpuMaxEnclosedCircle.cu
1>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\V110\BuildCustomizations\CUDA 7.5.targets(604,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin\nvcc.exe" -gencode=arch=compute_20,code=\"sm_20,compute_20\" --use-local-env --cl-version 2012 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_amd64" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include" -G --keep-dir x64\Debug -maxrregcount=0 --machine 64 --compile -cudart static -g -DWIN32 -D_DEBUG -D_UNICODE -DUNICODE -Xcompiler "/EHsc /W2 /nologo /Od /Zi /MDd " -o x64\Debug\gpuMaxEnclosedCircle.cu.obj "C:\Users\Administrator\Documents\Visual Studio 2012\Projects\SV11\SV11\gpuMaxEnclosedCircle.cu"" exited with code 2.
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========
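Both errors in the output point at line 24 of the .cu file, which appears to be the kernel launch. "type name is not allowed" and "too many arguments in function call" are consistent with `void` being passed as an argument at the call site, since `void` belongs only in the declaration. A minimal sketch of the same file with an empty argument list follows. The `int` return value, the `<cstdio>` include, and the error checks are assumptions added for illustration and are not part of the original file:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Same empty kernel as in the question. gid is still unused,
// so nvcc will keep reporting the "never referenced" warning.
__global__ void initialize_labels(void)
{
    unsigned int gid = threadIdx.x + blockIdx.x * blockDim.x;
}

// C linkage so that a .cpp file can call this without name mangling.
// Returning int is an assumption; the original function returned void.
extern "C" int tinit(void)
{
    dim3 blDim(32, 8);
    dim3 grDim(10, 10);

    // The call takes an empty argument list; `void` is not a valid argument.
    initialize_labels<<<grDim, blDim>>>();

    // Assumed addition: surface launch and execution errors.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();
    if (err != cudaSuccess)
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));

    return static_cast<int>(err);  // 0 corresponds to cudaSuccess
}
```

On the .cpp side, the matching declaration for this sketch would be `extern "C" int tinit(void);`, with no CUDA headers needed in that file as long as it only calls the wrapper.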
As far as I can tell, NSight is properly installed.
Any help in figuring out whether this is a CUDA install issue or something else is appreciated.