Struct of pointers: segmentation fault

Hi all,

Perhaps what I am trying to do seems a bit strange, but the idea was to reduce the number of parameters I pass to my kernels.

So the idea is to put all the pointers to the arrays I use on the GPU into one struct, and to pass that struct to the kernel. This is what I did:

typedef struct
{
    xy_struct* xy_coords; // pointer to another struct
    z_struct* z_coords;   // pointer to another struct
    int* b_i;
    float* pre_sum;
    int* BIN_events;
} ptr_struct_d;

void MLEM(){
    ptr_struct_d ptr_array_d;
    xy_struct* xy_coords_d;
    ptr_array_d.xy_coords = xy_coords_d;
    int size_xy = 192*192;
    CUDA_SAFE_CALL(cudaMalloc((void **) &ptr_array_d.xy_coords, sizeof(xy_struct)*size_xy));
    ....
    // doing it in the same way for the other arrays

    Sinogram<<<numBlocks, numThreadsPerBlock>>>(ptr_array_d.xy_coords); // in this kernel the array xy_coords is filled; just passing one pointer works
    pre_sum_calc<<<numBlocks, numThreadsPerBlock>>>(ptr_array_d, N, n_events); // here I want to pass the whole struct with all pointers, but I get a segmentation fault

My kernels:

__global__ void Sinogram(xy_struct* xy_coords){
    // filling of the array...
}

__global__ void pre_sum_calc(ptr_struct_d ptr_array, int N, int n_events)
{
    int* BIN_events = ptr_array.BIN_events; // I would like to use this array, later another one and so on
    int BIN = BIN_events[tID];
}

So the first kernel, where I pass just one pointer, launches and gives results; if I pass the whole struct, it doesn't work. Do I have to pass a pointer to the struct and do a cudaMemcpy first? Or is there another problem in my code?

BTW, emulation mode runs fine and gives the right results, and compiling the device release build doesn't give any warnings about the pointers. The error I get is a segmentation fault.

Perhaps this idea seems senseless to an experienced programmer, so any comment is welcome!

For this to work you …

must allocate device memory

must perform a cudaMemcpy

must pass a pointer to device memory to the kernel

Overall this takes longer than just passing many arguments, believe me ;)
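
The three steps listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: PtrStruct is a hypothetical, cut-down stand-in for ptr_struct_d with two arrays, the kernel body (doubling each entry) is invented for the example, and the CHECK macro is a local helper, not part of the original code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Local helper (assumption): abort main() on any CUDA runtime error.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); return 1; } } while (0)

// Hypothetical stand-in for ptr_struct_d, reduced to two arrays.
struct PtrStruct {
    int*   BIN_events;
    float* pre_sum;
};

// The kernel receives a pointer to a struct that itself lives in device memory.
__global__ void pre_sum_calc(const PtrStruct* p, int n)
{
    int tID = blockIdx.x * blockDim.x + threadIdx.x;
    if (tID < n) {
        p->pre_sum[tID] = 2.0f * p->BIN_events[tID]; // invented placeholder work
    }
}

int main()
{
    const int n = 256;
    int h_events[n];
    for (int i = 0; i < n; ++i) h_events[i] = i;

    // Host-side struct whose members hold device addresses.
    PtrStruct h_ptrs;
    CHECK(cudaMalloc((void**)&h_ptrs.BIN_events, n * sizeof(int)));
    CHECK(cudaMalloc((void**)&h_ptrs.pre_sum, n * sizeof(float)));
    CHECK(cudaMemcpy(h_ptrs.BIN_events, h_events, n * sizeof(int), cudaMemcpyHostToDevice));

    // Step 1: allocate device memory for the struct itself.
    PtrStruct* d_ptrs = nullptr;
    CHECK(cudaMalloc((void**)&d_ptrs, sizeof(PtrStruct)));
    // Step 2: copy the struct (i.e. the pointer values) to the device.
    CHECK(cudaMemcpy(d_ptrs, &h_ptrs, sizeof(PtrStruct), cudaMemcpyHostToDevice));
    // Step 3: pass the device pointer to the kernel.
    pre_sum_calc<<<(n + 127) / 128, 128>>>(d_ptrs, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    // Copy the result back and compare against the host-side expectation.
    float h_sum[n];
    CHECK(cudaMemcpy(h_sum, h_ptrs.pre_sum, n * sizeof(float), cudaMemcpyDeviceToHost));
    int mismatches = 0;
    for (int i = 0; i < n; ++i) if (h_sum[i] != 2.0f * i) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    CHECK(cudaFree(d_ptrs));
    CHECK(cudaFree(h_ptrs.BIN_events));
    CHECK(cudaFree(h_ptrs.pre_sum));
    return mismatches == 0 ? 0 : 1;
}
```

Note that the host never dereferences h_ptrs.BIN_events or h_ptrs.pre_sum; those members are device addresses and are only handed to CUDA runtime calls or used inside the kernel.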

Thanks, so I will go back to the normal way and just pass my parameters. At least then I know that it works.

But just one more question… is the reason it doesn't work that you can't pass a structure as a parameter?

Well, I think it should either work or the compiler should fail. If it crashes at runtime and your code is not at fault I think it is a bug in the compiler/runtime.

I admit I never tried such a thing myself; it always seemed too risky (also due to potential alignment issues, though there should not be any with your example).
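
For comparison, here is a minimal sketch of the by-value variant the question asks about. Kernel arguments are copied to the device at launch, so a small struct of device pointers can be passed directly, provided every member the kernel touches was actually allocated with cudaMalloc and the struct fits within the kernel parameter size limit. PtrStruct, the kernel body, and the CHECK macro are again hypothetical stand-ins, not the original poster's code, and this sketch does not diagnose the crash reported in the thread.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Local helper (assumption): abort main() on any CUDA runtime error.
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); return 1; } } while (0)

// Hypothetical stand-in for ptr_struct_d, reduced to two arrays.
struct PtrStruct {
    int*   BIN_events;
    float* pre_sum;
};

// The struct is taken by value: each launch gets its own copy of the pointers.
__global__ void pre_sum_calc(PtrStruct p, int n)
{
    int tID = blockIdx.x * blockDim.x + threadIdx.x;
    if (tID < n) {                                 // guard against out-of-range threads
        p.pre_sum[tID] = 2.0f * p.BIN_events[tID]; // invented placeholder work
    }
}

int main()
{
    const int n = 256;
    int h_events[n];
    for (int i = 0; i < n; ++i) h_events[i] = i;

    // Every pointer member the kernel uses must be allocated before the launch.
    PtrStruct ptrs;
    CHECK(cudaMalloc((void**)&ptrs.BIN_events, n * sizeof(int)));
    CHECK(cudaMalloc((void**)&ptrs.pre_sum, n * sizeof(float)));
    CHECK(cudaMemcpy(ptrs.BIN_events, h_events, n * sizeof(int), cudaMemcpyHostToDevice));

    // Pass the struct itself; no extra cudaMalloc/cudaMemcpy for the struct.
    pre_sum_calc<<<(n + 127) / 128, 128>>>(ptrs, n);
    CHECK(cudaGetLastError());      // reports launch errors
    CHECK(cudaDeviceSynchronize()); // reports errors raised while the kernel ran

    float h_sum[n];
    CHECK(cudaMemcpy(h_sum, ptrs.pre_sum, n * sizeof(float), cudaMemcpyDeviceToHost));
    int mismatches = 0;
    for (int i = 0; i < n; ++i) if (h_sum[i] != 2.0f * i) ++mismatches;
    printf("mismatches: %d\n", mismatches);

    CHECK(cudaFree(ptrs.BIN_events));
    CHECK(cudaFree(ptrs.pre_sum));
    return mismatches == 0 ? 0 : 1;
}
```

Checking cudaGetLastError and cudaDeviceSynchronize after the launch helps separate a failure inside the kernel (for example an unallocated member or an out-of-range index) from a crash in host code.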