I tried to run some simple CUDA programs, which compiled successfully. They run without any warnings or errors, but nothing happens.
The program below adds a constant to each array element and prints the new values at the end. However, it still prints the numbers 0-99 instead of those numbers increased by 10.
__global__ void kernelTest(int* i, int length){
    unsigned int tid = blockIdx.x*blockDim.x + threadIdx.x;
    if(tid < length)
        i[tid] = i[tid] + 10;
}
/*
 * This is the main routine which declares and initializes the integer vector, moves it to the device, launches kernel
 * brings the result vector back to host and dumps it on the console.
 */
int main(){
    int length = 100;
    int* i = (int*)malloc(length*sizeof(int));
    for(int x=0;x<length;x++)
        i[x] = x;
    int* i_d;
    cudaMalloc((void**)&i_d,length*sizeof(int));
    cudaMemcpy(i_d, i, length*sizeof(int), cudaMemcpyHostToDevice);
    dim3 threads; threads.x = 32;
    dim3 blocks; blocks.x = (length/threads.x) + 1;
    kernelTest<<<threads,blocks>>>(i_d,length);
    cudaMemcpy(i, i_d, length*sizeof(int), cudaMemcpyDeviceToHost);
    for(int x=0;x<length;x++)
        printf("%d\t",i[x]);
    std::cin;
}
Probably invalid execution parameters: blocks and threads contain uninitialized values. Also, although it is not likely to cause an error here, blocks and threads are reversed in the kernel call (the grid dimensions come first, then the block dimensions). If you add some error checking to the code, all of this becomes immediately obvious.
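A minimal sketch of the launch with the arguments in the documented order (grid dimensions first, block dimensions second) and with every dim3 component set explicitly. The helper name blocksFor is hypothetical and not part of the original program; it uses ceiling division, so no extra block is launched when length is a multiple of the block size. Error checking is left out to keep the sketch short.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Adds 10 to every element; the bounds check guards the last, partly filled block.
__global__ void kernelTest(int* data, int length) {
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < (unsigned int)length)
        data[tid] = data[tid] + 10;
}

// Hypothetical helper (not in the original post): number of blocks needed
// to cover `length` elements, computed by ceiling division.
unsigned int blocksFor(unsigned int length, unsigned int threadsPerBlock) {
    return (length + threadsPerBlock - 1) / threadsPerBlock;
}

int main() {
    const int length = 100;
    const size_t bytes = length * sizeof(int);

    int* h = (int*)malloc(bytes);
    for (int x = 0; x < length; x++)
        h[x] = x;

    int* d = NULL;
    cudaMalloc((void**)&d, bytes);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);

    // Set x, y and z explicitly instead of relying on defaults.
    dim3 threads(32, 1, 1);
    dim3 blocks(blocksFor(length, threads.x), 1, 1);

    // Grid (number of blocks) comes first, block (threads per block) second.
    kernelTest<<<blocks, threads>>>(d, length);

    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    for (int x = 0; x < length; x++)
        printf("%d\t", h[x]);
    printf("\n");

    cudaFree(d);
    free(h);
    return 0;
}
```

With length = 100 and 32 threads per block, blocksFor gives 4 blocks, i.e. 128 threads, and the tid check in the kernel keeps the extra 28 threads from writing past the end of the array.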
I’m sorry, I’m not used to programming in C. I was under the impression that the program would crash if there were memory violations or uninitialized variables. How can I add error checking?
If you check for errors directly after the kernel call, you should see the error message reported (probably “Launch failure”). In your code, both blocks and threads are dim3 structures, but you never set the values of blocks.y, blocks.z, threads.y, and threads.z. That is the likely source of the error.
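One common way to do this, sketched under assumptions: wrap every runtime call in a small macro and query cudaGetLastError() right after the launch. The macro name CUDA_CHECK is hypothetical. cudaGetLastError, cudaGetErrorString and cudaDeviceSynchronize are CUDA runtime API functions (older toolkits used cudaThreadSynchronize instead of cudaDeviceSynchronize).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical macro (not from the original post): print a readable
// message and exit if a CUDA runtime call returns an error.
#define CUDA_CHECK(call)                                           \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                    cudaGetErrorString(err_));                     \
            exit(EXIT_FAILURE);                                    \
        }                                                          \
    } while (0)

__global__ void kernelTest(int* data, int length) {
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid < (unsigned int)length)
        data[tid] = data[tid] + 10;
}

int main() {
    const int length = 100;
    int* d = NULL;

    CUDA_CHECK(cudaMalloc((void**)&d, length * sizeof(int)));
    CUDA_CHECK(cudaMemset(d, 0, length * sizeof(int)));

    // A kernel launch returns no status itself, so ask for it afterwards.
    kernelTest<<<dim3(4, 1, 1), dim3(32, 1, 1)>>>(d, length);
    CUDA_CHECK(cudaGetLastError());       // errors in the launch configuration
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    CUDA_CHECK(cudaFree(d));
    printf("kernel completed without errors\n");
    return 0;
}
```

The two checks after the launch are separate because the launch is asynchronous: a bad configuration is reported immediately, while a fault inside the kernel only shows up once the host waits for the device.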
Thank you very much. I was looking for these functions. You were both right. I should have used the appropriate functions to initialize the kernel parameters. Unfortunately, there are many bad examples on the web. I will now stick to the official documentation.