Array of structures problem

Hi,

I have problems with an array of structures. Maybe it’s because of alignment problems, but I don’t understand the CUDA manual about that… I hope some clever people can help me. I have a structure:

[codebox]struct DBDStruct
{
    double Temperature;
    double DensN;
    double DensN2A;
};
[/codebox]

and fill the structure in CPU memory with some initial values, copy that to GPU memory, calculate on the GPU and copy it back. The whole code runs with “float” values, but with doubles I get problems. I added “-arch=sm_13” to the compiler flags (I have a GTX 285).

My memory allocation on CPU is:

[codebox]struct DBDStruct* oldGridValues = (struct DBDStruct*) malloc(Number_of_gridpoints_x * Number_of_gridpoints_y * sizeof(DBDStruct));[/codebox]

On GPU it is:

[codebox]struct DBDStruct* cuda_oldGridValues;

CUDA_SAFE_CALL(cudaMalloc((void**)&cuda_oldGridValues, Number_of_gridpoints_x*Number_of_gridpoints_y * sizeof(DBDStruct)));[/codebox]

and I copy the stuff to the GPU with:

[codebox]CUDA_SAFE_CALL(cudaMemcpy(cuda_oldGridValues, oldGridValues, Number_of_gridpoints_x*Number_of_gridpoints_y * sizeof(DBDStruct), cudaMemcpyHostToDevice));[/codebox]

and copy back to CPU memory with:

[codebox] CUDA_SAFE_CALL(cudaMemcpy(newGridValues, cuda_newGridValues, Number_of_gridpoints_x*Number_of_gridpoints_y * sizeof(DBDStruct), cudaMemcpyDeviceToHost));[/codebox]

What is the easiest way to get it running with doubles?

Thx for any help!!!

Philipp.

What kind of “problems” do you have? ;)

Oh, well (sorry): I get garbage results. Some points are not initialized in GPU memory, and some points seem to appear in two places when copied back to CPU memory. That’s why I thought it was an alignment problem…

No prob.

Have you tried to debug the application in device emulation mode, so you can enter in kernel lines and watch how structs are copied?

No, I didn’t. I have never tried that and don’t really know how to do it… But don’t I get problems with structs larger than 16 bytes anyway when I use the lines above for memory allocation, because on the device there are some kind of “spaces” in the arrays? I really don’t understand the manual on that point…

I had the same problem in another simulation when I had a structure with more than 4 floats. With 4 floats or fewer (= 16 bytes) it runs pretty well…
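One way to test the “spaces” worry is to look at the layout the compiler actually uses for the struct. The following is only a minimal host-side sketch in plain C; the helper names (dbd_is_tightly_packed, dbd_array_stride, dbd_print_layout) are made up for illustration and are not part of CUDA. It assumes the usual case where the device compiler lays out a plain struct of doubles the same way as the host compiler, which can be checked by printing sizeof(DBDStruct) from a kernel as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Same three-double struct as in the first post. */
struct DBDStruct {
    double Temperature;
    double DensN;
    double DensN2A;
};

/* Returns 1 if the members sit back to back with no padding, else 0. */
int dbd_is_tightly_packed(void)
{
    return offsetof(struct DBDStruct, Temperature) == 0
        && offsetof(struct DBDStruct, DensN) == sizeof(double)
        && offsetof(struct DBDStruct, DensN2A) == 2 * sizeof(double)
        && sizeof(struct DBDStruct) == 3 * sizeof(double);
}

/* Byte distance between two consecutive array elements. */
size_t dbd_array_stride(void)
{
    struct DBDStruct pair[2];
    return (size_t)((char *)&pair[1] - (char *)&pair[0]);
}

/* Prints the layout so it can be compared with values printed on the device. */
void dbd_print_layout(void)
{
    /* C guarantees that array elements are contiguous, so this always holds. */
    assert(dbd_array_stride() == sizeof(struct DBDStruct));
    printf("sizeof = %zu, stride = %zu, packed = %d\n",
           sizeof(struct DBDStruct), dbd_array_stride(),
           dbd_is_tightly_packed());
}
```

If the stride equals sizeof(DBDStruct), an allocation of count * sizeof(DBDStruct) bytes covers the whole array with no hidden gaps, so the cudaMalloc and cudaMemcpy sizes in the first post would be consistent with the host array.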

OK, I agree with you, maybe there is an alignment problem.

I’m sorry, but I’ve never used an array of structures, so I’ve never dealt with alignment in memory copies. The only thing I can suggest is to try cudaMallocArray() and cudaMemcpyArrayToArray(), but I’ve never used them, so I cannot help you understand the parameters you need to pass to these functions.

EDIT: Look at the CUDA reference manual for the function signatures ;)

Does anyone know what the problem is? :(

Did you try to lower the dimensions of your blocks? It might not be a problem with copying your arrays back and forth, but with the operations you do on them. What happens when you simply assign a constant to the output array and copy it back (commenting out everything else)?
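The constant-assignment test can be made stricter on the host side. This is only a sketch in plain C with hypothetical helper names (dbd_poison, dbd_count_mismatches): poison a host buffer with NaN, copy it into the device output array before the kernel launch, let the kernel write the constant, copy back, and count the points that do not hold the constant. The kernel itself and the cudaMemcpy calls are not shown here; they would be the ones from the first post.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Same three-double struct as in the first post. */
struct DBDStruct {
    double Temperature;
    double DensN;
    double DensN2A;
};

/* Fill every field with NaN so untouched grid points are easy to spot. */
void dbd_poison(struct DBDStruct *grid, size_t count)
{
    assert(grid != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        grid[i].Temperature = NAN;
        grid[i].DensN = NAN;
        grid[i].DensN2A = NAN;
    }
}

/* Count grid points where any field differs from the expected constant.
   NaN compares unequal to everything, so poisoned points are counted too. */
size_t dbd_count_mismatches(const struct DBDStruct *grid, size_t count,
                            double expected)
{
    size_t bad = 0;
    assert(grid != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (!(grid[i].Temperature == expected &&
              grid[i].DensN == expected &&
              grid[i].DensN2A == expected)) {
            ++bad;
        }
    }
    return bad;
}
```

A mismatch count of zero after the round trip would suggest that allocation and copies are fine and the trouble is in the real kernel. A nonzero count would point at the launch configuration or the indexing instead.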

I had a similar problem when I used too many registers in a block with doubles, then I got garbage in the results. Making the block smaller somehow helped. As far as the alignment goes, it is useful for faster copies in and out of global memory, but shouldn’t matter otherwise.

Cheers