Question: blockIdx and array index mapping

Hi all,

I am having trouble mapping the grid index, blockIdx and array index. I have an array of size x whose values should be filled in by the kernel. On the host, I initialize every element of the array to 1. When I print the array after copying it back to the host, the values are still 1. I suspect the mapping is the problem, because I am not sure I am converting blockIdx to an array index correctly. Could you please take a look at my code and let me know where the problem is? (I am using a 1D grid of 1D blocks.)

Host

    ...
    int x = 6;
    int* d = (int*)malloc(sizeof(int)*x);
    for(j=0; j<x; j++){
        d[j] = 1;
    }
    int* dD = (int*)malloc(sizeof(int)*size);
    CUDA_SAFE_CALL(cudaMalloc((void**)&dD,size*sizeof(int)));
    CUDA_SAFE_CALL(cudaMemcpy(dD,d,size,cudaMemcpyHostToDevice));
    ...
    myKernel<<<dimGrid,dimBlock>>>(par1,par2,par3,...,d);

    CUDA_SAFE_CALL(cudaMemcpy(d,dD,x,cudaMemcpyDeviceToHost));
    for(j=0;j<x;j++){
        printf("\n  d[%d]=%d",j,d[j]);
    }
    ...

Kernel…

    __global__ void myKernel(...,int* d){
        int id = blockIdx.x*bz+threadIdx.x;
        // The id may be wrong. I would like to fill in the values starting from d[0], d[1], ... Since I have allocated memory for d in global memory, will the following be written to global memory?
        d[id] = id;
    }

Much appreciated!

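int* dD = (int*)malloc(sizeof(int)*size); should be just int *dD; (cudaMalloc overwrites the pointer, so the host buffer from malloc is never used and is simply leaked.)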

int id = blockIdx.x*bz+threadIdx.x; should be
int id = threadIdx.x + __mul24(blockIdx.x, blockDim.x); (I assume your bz is meant to be blockDim.x.)

CUDA_SAFE_CALL(cudaMemcpy(d,dD,x,cudaMemcpyDeviceToHost)); needs to be
CUDA_SAFE_CALL(cudaMemcpy(d,dD,sizeof(int)*size, cudaMemcpyDeviceToHost));

You were only copying the first 6 bytes (one and a half ints), because the third argument of cudaMemcpy is a size in bytes, not an element count.
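For reference, here is a minimal sketch of the host and kernel code with the fixes above applied. Two further things in the posted code look suspicious and are changed here as well: the kernel is launched with the host pointer d rather than the device pointer dD, and the host-to-device cudaMemcpy also passes an element count instead of a byte count. The sketch assumes size is meant to equal x, drops the other kernel parameters (par1, par2, ...), adds a hypothetical length parameter n for a bounds check, and uses an arbitrary launch configuration. Error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each thread writes its own global index into d.
// n is a hypothetical extra parameter used to guard against overrun.
__global__ void myKernel(int* d, int n)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;  // global 1D index
    if (id < n) {
        d[id] = id;
    }
}

int main()
{
    const int x = 6;                       // number of elements, as in the question
    const size_t bytes = x * sizeof(int);  // cudaMalloc/cudaMemcpy take byte counts

    int* d = (int*)malloc(bytes);          // host array
    for (int j = 0; j < x; j++) {
        d[j] = 1;
    }

    int* dD = NULL;                        // device pointer: no host malloc needed
    cudaMalloc((void**)&dD, bytes);
    cudaMemcpy(dD, d, bytes, cudaMemcpyHostToDevice);

    // Launch configuration chosen for illustration only.
    dim3 dimBlock(4);
    dim3 dimGrid((x + dimBlock.x - 1) / dimBlock.x);  // enough blocks to cover x
    myKernel<<<dimGrid, dimBlock>>>(dD, x);           // pass dD (device), not d (host)

    cudaMemcpy(d, dD, bytes, cudaMemcpyDeviceToHost);
    for (int j = 0; j < x; j++) {
        printf("d[%d]=%d\n", j, d[j]);
    }

    cudaFree(dD);
    free(d);
    return 0;
}
```

The bounds check matters because the grid is rounded up to whole blocks: with 4 threads per block and 6 elements, 8 threads are launched and the last two must not write.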

Thanks for the reply, but it is still not working. It seems like the kernel was not even executed, yet there is no compile error.

Did you compile in debug mode? Then CUDA_SAFE_CALL will tell you whether the kernel exited with an error.
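If you would rather not depend on the build mode, you can also check for kernel errors explicitly with the runtime API. Below is a rough sketch, assuming a toolkit recent enough to have cudaDeviceSynchronize (older releases used cudaThreadSynchronize instead). checkCuda is a hypothetical helper written for this example, not part of CUDA or cutil, and the kernel and launch configuration are placeholders.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: report a failed CUDA call and return false.
static bool checkCuda(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return false;
    }
    return true;
}

// Placeholder kernel: each thread stores its global index.
__global__ void myKernel(int* d, int n)
{
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < n) {
        d[id] = id;
    }
}

int main()
{
    const int n = 6;
    int* dD = NULL;
    if (!checkCuda(cudaMalloc((void**)&dD, n * sizeof(int)), "cudaMalloc")) {
        return 1;
    }

    myKernel<<<2, 4>>>(dD, n);

    // Problems with the launch itself (e.g. an invalid configuration) show up here.
    if (!checkCuda(cudaGetLastError(), "kernel launch")) {
        return 1;
    }
    // Problems during execution (e.g. an invalid memory access) show up
    // once the host waits for the kernel to finish.
    if (!checkCuda(cudaDeviceSynchronize(), "kernel execution")) {
        return 1;
    }

    cudaFree(dD);
    printf("kernel completed without reported errors\n");
    return 0;
}
```

A kernel that dereferences a host pointer would typically be reported at the synchronize step, which is one way a kernel can appear to "not run" while the program still compiles cleanly.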