Hi,
I’m developing a fairly large program using CUDA.
My program needs a lot of memory (for matrix computations), so I allocated global-memory buffers for the data and passed their addresses into kernels.
However, the number of such parameters is large, so I grouped them into a struct.
Part of my code is below:
struct GroupParameter {
float *a;
float *b;
...
};
/////////////////////////////////
int main() {
...
GroupParameter *param;
cudaMalloc((void**)&param, sizeof(GroupParameter));
cudaMalloc((void**)&param->a, sizeof(float) * 100); // (1)
cudaMalloc((void**)&param->b, sizeof(float) * 100);
mykernel<<<dimGrid, dimBlock>>>(param);
}
/////////////////////////////////
__global__ void mykernel(GroupParameter *param)
{
// (param->a) == 0 here [my problem] (2)
}
When I print “param->a” on the host right after the allocation at (1), I get what looks like a proper device address.
However, I always get 0 whenever I print the value of “param->a” from inside mykernel (I wrote the value into another device buffer, copied it back, and printed it on the host).
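To be concrete, this is roughly how I read the value back (a sketch from memory, not my exact code; “readback” and “d_out” are just names I’m using here):

```cuda
// Hypothetical helper kernel: copy the pointer value stored in param->a
// into a plain device buffer that the host can cudaMemcpy back and print.
__global__ void readback(GroupParameter *param, float **out)
{
    *out = param->a;  // whatever pointer value the kernel sees in param->a
}

// Host side:
float **d_out;
float *h_out = NULL;
cudaMalloc((void**)&d_out, sizeof(float*));
readback<<<1, 1>>>(param, d_out);
cudaMemcpy(&h_out, d_out, sizeof(float*), cudaMemcpyDeviceToHost);
printf("param->a as seen by the kernel: %p\n", (void*)h_out);
```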
What is the problem?
There are two things I’m suspicious of:
-
Is writing to “param->a” in host code at (1) even valid? “param” itself points to global (device) memory, and as far as I know, host code cannot dereference a device pointer.
-
Is reading “param->a” in the kernel at (2) valid? I’ve heard that even though “param” points to global memory, kernel parameters themselves are stored in shared memory.
So, does anybody know how I can properly access the global-memory buffers (a and b) through *(param->a) and *(param->b) inside the kernel?
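For what it’s worth, here is the pattern I’m starting to suspect is needed (a sketch, untested; “h_param” and “d_param” are names I made up): fill the struct in host memory, cudaMalloc only its pointer members, then copy the whole struct to the device. Is this the right approach?

```cuda
GroupParameter h_param;    // staging copy that lives in host memory
GroupParameter *d_param;   // the copy the kernel will actually receive

// Allocate the member buffers. &h_param.a is a HOST address,
// so it is legal to write the returned device pointers through it.
cudaMalloc((void**)&h_param.a, sizeof(float) * 100);
cudaMalloc((void**)&h_param.b, sizeof(float) * 100);

// Allocate the struct itself on the device, then copy the filled-in
// host struct (which now holds valid device pointers) into it.
cudaMalloc((void**)&d_param, sizeof(GroupParameter));
cudaMemcpy(d_param, &h_param, sizeof(GroupParameter), cudaMemcpyHostToDevice);

mykernel<<<dimGrid, dimBlock>>>(d_param);
```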