Hello. I am just beginning to learn CUDA and am having some difficulty getting my projects to compile. I am using VS2010 on Win7 64-bit. I have several test applications running, and what I am working on now is calling a simple CUDA kernel from a .cpp file.
However, when I compile my program I get the following errors:
[b]1> main.cpp
1>c:\users\santos\documents\visual studio 2010\projects\cuda_opengl\cuda_opengl\test_kernel.cu(6): error C2065: 'blockIdx' : undeclared identifier
1>c:\users\santos\documents\visual studio 2010\projects\cuda_opengl\cuda_opengl\test_kernel.cu(6): error C2228: left of '.x' must have class/struct/union
1> type is ''unknown-type''
1>c:\users\santos\documents\visual studio 2010\projects\cuda_opengl\cuda_opengl\main.cpp(33): error C2059: syntax error : '<'
1>
1>Build FAILED.[/b]
Anyway, I’m going to keep working on it, but if you have any insight into what my problem is I would greatly appreciate it. It seems as though every time I make a post in a forum about a problem I’m having I solve it on my own shortly after. Hopefully, that will be the case with this as well. Thanks for your time.
Entire error message:
1>------ Rebuild All started: Project: CUDA_OpenGL, Configuration: Debug Win32 ------
1>Build started 3/22/2011 5:07:28 PM.
1>_PrepareForClean:
1> Deleting file "Debug\CUDA_OpenGL.lastbuildstate".
1>CudaClean:
1>
1> C:\Users\Santos\Documents\Visual Studio 2010\Projects\CUDA_OpenGL\CUDA_OpenGL>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\\bin\nvcc.exe" -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\\include" -G0 --keep-dir "Debug\\" -maxrregcount=32 --machine 32 --compile -D_NEXUS_DEBUG -g -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Debug\test_kernel.obj" "C:\Users\Santos\Documents\Visual Studio 2010\Projects\CUDA_OpenGL\CUDA_OpenGL\test_kernel.cu" -clean
1> Deleting file "Debug\test_kernel.cu.deps".
1>InitializeBuildStatus:
1> Touching "Debug\CUDA_OpenGL.unsuccessfulbuild".
1>CudaBuild:
1> Compiling CUDA source file test_kernel.cu...
1>
1> C:\Users\Santos\Documents\Visual Studio 2010\Projects\CUDA_OpenGL\CUDA_OpenGL>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\\bin\nvcc.exe" -gencode=arch=compute_10,code=\"sm_10,compute_10\" --use-local-env --cl-version 2008 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\\include" -G0 --keep-dir "Debug\\" -maxrregcount=32 --machine 32 --compile -D_NEXUS_DEBUG -g -Xcompiler "/EHsc /nologo /Od /Zi /MDd " -o "Debug\test_kernel.obj" "C:\Users\Santos\Documents\Visual Studio 2010\Projects\CUDA_OpenGL\CUDA_OpenGL\test_kernel.cu"
1> test_kernel.cu
1> tmpxft_000006a4_00000000-0_test_kernel.cudafe1.gpu
1> tmpxft_000006a4_00000000-5_test_kernel.cudafe2.gpu
1> test_kernel.cu
1> tmpxft_000006a4_00000000-0_test_kernel.cudafe1.cpp
1> tmpxft_000006a4_00000000-11_test_kernel.ii
1> Deleting file "tmpxft_000006a4_00000000-6_test_kernel.cpp3.o".
1>ClCompile:
1> main.cpp
1>c:\users\santos\documents\visual studio 2010\projects\cuda_opengl\cuda_opengl\test_kernel.cu(6): error C2065: 'blockIdx' : undeclared identifier
1>c:\users\santos\documents\visual studio 2010\projects\cuda_opengl\cuda_opengl\test_kernel.cu(6): error C2228: left of '.x' must have class/struct/union
1> type is ''unknown-type''
1>c:\users\santos\documents\visual studio 2010\projects\cuda_opengl\cuda_opengl\main.cpp(33): error C2059: syntax error : '<'
1>
1>Build FAILED.
1>
1>Time Elapsed 00:00:01.53
========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========
This is my main.cpp, which calls the kernel:
#include <cuda.h>
#include "cuda_runtime.h"
#include <stdio.h>
#include <iostream>
#include "test_kernel.cu"
#define N 10
int main(){
    cudaDeviceProp prop;
    int count;
    int a[N], b[N], c[N];
    int *dev_a, *dev_b, *dev_c;
    // allocate the memory on the GPU
    cudaMalloc( (void**)&dev_a, N * sizeof(int) );
    cudaMalloc( (void**)&dev_b, N * sizeof(int) );
    cudaMalloc( (void**)&dev_c, N * sizeof(int) );
    // fill the arrays 'a' and 'b' on the CPU
    for (int i=0; i<N; i++) {
        a[i] = i;
        b[i] = i;
    }
    // copy the arrays 'a' and 'b' to the GPU
    cudaMemcpy( dev_a, a, N * sizeof(int), cudaMemcpyHostToDevice );
    cudaMemcpy( dev_b, b, N * sizeof(int), cudaMemcpyHostToDevice );
    add<<<N,1>>>( dev_a, dev_b, dev_c );
    // copy the array 'c' back from the GPU to the CPU
    cudaMemcpy( c, dev_c, N * sizeof(int), cudaMemcpyDeviceToHost );
    // display the results
    for (int i=0; i<N; i++) {
        printf( "%d + %d = %d\n", a[i], b[i], c[i] );
    }
    std::cin.get();
    // free the memory allocated on the GPU
    cudaFree( dev_a );
    cudaFree( dev_b );
    cudaFree( dev_c );
    return 0;
}
Finally, here is my kernel, test_kernel.cu:
__global__ void add(int *a, int *b, int *c)
{
    int tid = blockIdx.x;
    c[tid] = a[tid] + b[tid];
}
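One possible reading of the errors above, offered as an assumption rather than a confirmed diagnosis: the build log shows the errors appearing under ClCompile while main.cpp is being compiled, that is, by the regular Visual C++ compiler and not by nvcc. Because main.cpp does #include "test_kernel.cu" and also contains the add<<<N,1>>> launch, the host compiler ends up seeing blockIdx and the <<<...>>> syntax, which only nvcc understands. A minimal sketch of one common arrangement keeps both the kernel and its launch inside the .cu file and exposes an ordinary function to main.cpp. The wrapper name launch_add and its parameter n are hypothetical and do not come from the original project.

```cuda
// test_kernel.cu -- compiled by nvcc, never #included from a .cpp file
#include <cuda_runtime.h>

// Same kernel as in the post: one block per element, indexed by blockIdx.x.
__global__ void add(int *a, int *b, int *c)
{
    int tid = blockIdx.x;
    c[tid] = a[tid] + b[tid];
}

// Hypothetical host-side wrapper (the name is an assumption).
// The <<<...>>> launch syntax stays in the .cu file, where nvcc handles it.
void launch_add(int *dev_a, int *dev_b, int *dev_c, int n)
{
    add<<<n, 1>>>(dev_a, dev_b, dev_c);
}

// main.cpp -- compiled by the host compiler.
// Instead of  #include "test_kernel.cu"  it would only declare the wrapper:
//
//     void launch_add(int *dev_a, int *dev_b, int *dev_c, int n);
//
// and the launch line would become:
//
//     launch_add(dev_a, dev_b, dev_c, N);
```

With this split, main.cpp contains only standard C++ plus CUDA runtime API calls such as cudaMalloc and cudaMemcpy, so the host compiler has nothing CUDA-specific to parse. Whether this resolves the build in this particular project is untested here.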