Problem accessing global memory: general protection fault

Hello there.

I run into a problem when I try to access a variable located in device memory. Here is the situation, in short:

// ------ CODE -------

struct mystruct
{
    int dummy;
};

#define ARRAY_SIZE 16

__device__ struct mystruct* myarray;

void test()
{
    // dynamically allocate the array of structs
    cudaMalloc((void**)&myarray, ARRAY_SIZE*sizeof(struct mystruct));

    // this triggers an exception ?!?!?
    myarray[0].dummy = 1;
}

// ------- END CODE -------

An important thing to mention is that this happens only when running on the device. The code runs OK in emulation mode.
My GPU is an 8600 GTS.

Has anyone had the same problem? I appreciate any help!

Peter

cudaMalloc is an API call for host code. It allocates device memory and returns the start address through its first argument. There is no need to declare the pointer with __device__.

The variable whose address you pass into cudaMalloc should therefore be an ordinary pointer variable on the host, not a __device__ variable. (It sometimes works in emulation because everything is on the host then.)

To access the memory, pass this pointer to the kernel as a parameter.
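
A minimal sketch of that approach, reusing mystruct and ARRAY_SIZE from the question. The kernel name set_dummy, the helper run_test, and the d_/h_ prefixes are made up for illustration, and the launch configuration is just one choice that covers 16 elements:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

struct mystruct
{
    int dummy;
};

#define ARRAY_SIZE 16

// Runs on the device, so dereferencing the device pointer is legal here.
__global__ void set_dummy(mystruct* array, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        array[i].dummy = value;
}

// Hypothetical helper: returns the value read back from element 0,
// or -1 if any CUDA call reports an error.
int run_test()
{
    // Ordinary host variable; it only stores a device address.
    mystruct* d_array = NULL;
    if (cudaMalloc((void**)&d_array, ARRAY_SIZE * sizeof(mystruct)) != cudaSuccess)
        return -1;

    // The write happens inside the kernel, not on the host.
    set_dummy<<<1, ARRAY_SIZE>>>(d_array, ARRAY_SIZE, 1);

    // Copy one element back to host memory before looking at it.
    mystruct h_first;
    cudaError_t err = cudaMemcpy(&h_first, d_array, sizeof(mystruct),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_array);
    return (err == cudaSuccess) ? h_first.dummy : -1;
}
```

The host never touches myarray[0] directly; it only hands the address to the kernel and to cudaMemcpy.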

Peter

Thanks for the quick response, but now there is another problem. Suppose the struct is defined as

struct mystruct
{
    float* dynarray;
};

and I need to dynamically allocate the array of structs first, and then allocate an array of floats for each element of that array? I still get the unhandled exception…

ANY memory allocated by cudaMalloc is DEVICE memory. You cannot dereference such a pointer anywhere except in a kernel. If you need to allocate a pointer inside an allocated structure, you need to make a “mirror” structure allocated on the host (with normal malloc/new/whatever), then cudaMalloc all of the pointers inside it and cudaMemcpy the mirror structure to device memory.
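
Roughly, the mirror approach could look like the sketch below. The function names alloc_structs and free_structs and the h_/d_ prefixes are invented for this example, and error handling is kept to the bare minimum:

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

struct mystruct
{
    float* dynarray;
};

// Hypothetical helper: releases the inner arrays, the device struct array,
// and the host mirror. cudaFree(NULL) is a harmless no-op.
void free_structs(mystruct* d_structs, mystruct* h_mirror, int n)
{
    for (int i = 0; i < n; ++i)
        cudaFree(h_mirror[i].dynarray);
    cudaFree(d_structs);
    free(h_mirror);
}

// Hypothetical helper: builds n structs in device memory, each pointing at
// its own device array of len floats. Returns the device address of the
// struct array, or NULL on failure. *h_mirror_out receives the host mirror,
// which the caller needs later to free the inner arrays.
mystruct* alloc_structs(int n, int len, mystruct** h_mirror_out)
{
    // The mirror lives in host memory, so writing h_mirror[i].dynarray is legal.
    mystruct* h_mirror = (mystruct*)calloc(n, sizeof(mystruct));
    if (!h_mirror)
        return NULL;

    mystruct* d_structs = NULL;
    bool ok = true;

    // Allocate each inner array and store its device address in the mirror.
    for (int i = 0; ok && i < n; ++i) {
        float* d_inner = NULL;
        ok = cudaMalloc((void**)&d_inner, len * sizeof(float)) == cudaSuccess;
        if (ok)
            h_mirror[i].dynarray = d_inner;
    }

    // Allocate the struct array on the device and copy the filled-in mirror over.
    ok = ok && cudaMalloc((void**)&d_structs, n * sizeof(mystruct)) == cudaSuccess;
    ok = ok && cudaMemcpy(d_structs, h_mirror, n * sizeof(mystruct),
                          cudaMemcpyHostToDevice) == cudaSuccess;

    if (!ok) {
        free_structs(d_structs, h_mirror, n);
        return NULL;
    }
    *h_mirror_out = h_mirror;
    return d_structs;
}
```

The returned pointer can be passed to a kernel, where d_structs[i].dynarray[j] is a valid access. The host keeps the mirror around only so it still knows the inner addresses when it is time to free them.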

As MisterAnderson42 emphasized, you cannot dereference a device memory pointer on the host. So you probably need to “invert” your structures, i.e. turn an array of structs into a struct of arrays. Then you allocate all the arrays, put their start pointers into the struct, and upload the struct.
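
One way this inversion could look, as a sketch only. The struct name mydata, its two members, the kernel fill, and the helper run_soa are all assumptions made for the example, and "uploading" is done here by passing the small struct to the kernel by value rather than by a separate cudaMemcpy:

```cuda
#include <cuda_runtime.h>

// Hypothetical struct of arrays: one struct, each member a whole device array.
struct mydata
{
    int*   dummy;   // one int per element
    float* values;  // one float per element
};

// The struct arrives by value, so its two pointers are copied to the device
// as part of the kernel arguments.
__global__ void fill(mydata data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data.dummy[i]  = i;
        data.values[i] = 0.5f * i;
    }
}

// Hypothetical helper: fills n elements on the device and returns the last
// float read back on the host, or -1.0f if any CUDA call fails.
float run_soa(int n)
{
    // Host struct; its members only store device addresses.
    mydata data = { NULL, NULL };
    float last = -1.0f;

    bool ok = cudaMalloc((void**)&data.dummy,  n * sizeof(int))   == cudaSuccess
           && cudaMalloc((void**)&data.values, n * sizeof(float)) == cudaSuccess;

    if (ok) {
        fill<<<1, n>>>(data, n);
        // Copy a single float back to host memory before reading it.
        if (cudaMemcpy(&last, data.values + (n - 1), sizeof(float),
                       cudaMemcpyDeviceToHost) != cudaSuccess)
            last = -1.0f;
    }

    cudaFree(data.dummy);
    cudaFree(data.values);
    return last;
}
```

With this layout the host only ever writes the two pointer members of a host-side struct, so nothing on the host dereferences device memory. The single-block launch assumes n does not exceed the maximum block size of the device.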

Peter