Hi all,
I am having trouble mapping blockIdx and threadIdx to an array index. I have an array of size x whose values should be filled in by the kernel. On the host, I initialize every element of the array to 1. When I print the elements after copying the array back to the host, they are all still 1. I suspect the mapping is the problem, because I am not sure I am converting blockIdx to an array index correctly. Could you please take a look at my code and let me know where the problem is? (I am using a 1D grid of 1D blocks.)
Host
   ...
   int x = 6;
   int* d = (int*)malloc(sizeof(int)*x);
   for(j=0; j<x; j++){
      d[j] = 1;
   }
   int* dD = (int*)malloc(sizeof(int)*size);
   CUDA_SAFE_CALL(cudaMalloc((void**)&dD,size*sizeof(int)));
   CUDA_SAFE_CALL(cudaMemcpy(dD,d,size,cudaMemcpyHostToDevice));
   ...
   myKernel<<<dimGrid,dimBlock>>>(par1,par2,par3,...,d);
   CUDA_SAFE_CALL(cudaMemcpy(d,dD,x,cudaMemcpyDeviceToHost));
   for(j=0;j<x;j++){
         printf("\n  d[%d]=%d",j,d[j]);
   }
   ...
Kernel…
   __global__ void myKernel(...,int* d){
       int id = blockIdx.x*bz+threadIdx.x;
       // The id may be the problem. I would like to fill d[0], d[1], ... in order.
       // Since I have allocated memory for d in global memory, will the following
       // write go to global memory?
       d[id] = id;
   }
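
For comparison, here is a minimal self-contained sketch of the pattern I am trying to get working. It is not my actual code: the names fillWithIndex and fillOnDevice, the bounds check, and the choice of 4 threads per block are assumptions made for illustration. It differs from the code above in three ways: the kernel is given the device pointer (dD) rather than the host pointer, the cudaMemcpy sizes are given in bytes, and the index uses the built-in blockDim.x. If the mapping is right, I would expect it to print d[0]=0 through d[5]=5.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread writes its own global index into d.
__global__ void fillWithIndex(int* d, int n) {
    // Global 1D index: offset of this block plus the thread's offset in it.
    int id = blockIdx.x * blockDim.x + threadIdx.x;
    if (id < n) {  // skip threads that fall past the end of the array
        d[id] = id;
    }
}

// Hypothetical host helper: copies host[0..n) to the device, runs the
// kernel, and copies the result back. Returns true if every call succeeded.
bool fillOnDevice(int* host, int n) {
    const size_t bytes = n * sizeof(int);  // cudaMemcpy counts bytes, not elements
    int* dD = NULL;
    if (cudaMalloc((void**)&dD, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dD, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const int threadsPerBlock = 4;  // assumed value, small on purpose
        const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
        fillWithIndex<<<blocks, threadsPerBlock>>>(dD, n);  // device pointer
        ok = cudaGetLastError() == cudaSuccess;
    }
    if (ok) {
        ok = cudaMemcpy(host, dD, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dD);
    return ok;
}

int main() {
    const int x = 6;
    int* d = (int*)malloc(x * sizeof(int));
    for (int j = 0; j < x; j++) d[j] = 1;

    if (!fillOnDevice(d, x)) {
        printf("CUDA call failed\n");
        free(d);
        return 1;
    }
    for (int j = 0; j < x; j++) printf("d[%d]=%d\n", j, d[j]);
    free(d);
    return 0;
}
```

With 6 elements and 4 threads per block, this launches 2 blocks (8 threads), so the bounds check is what keeps the last two threads from writing past the end of the array.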
Thanks in advance!