struct differs between emu and device

Hello forum.
Over the last few days I have started playing with CUDA and find it really nice.
But when I use structs, there seems to be a difference between emulation mode and the device (8500GT).

I have the following struct:

#define t_complex float2

struct t_matrix22 {
    t_complex a;
    t_complex b;
    t_complex c;
    t_complex d;
};

struct test_tt {
    long test;
    t_matrix22 matrix;
    float lambda;
    t_complex n_sub;
    float sin_phi;
    bool pol_s;
};

Nothing special so far.
I fill a test_tt with data, upload it to the device, and can download it again without problems.
All data stays intact (emu + device).

When the kernel accesses the struct, it looks as if data fields were shifted,
e.g. matrix.a.x = 2 writes to matrix.a.y as seen from the host.
The kernel itself reads the “right” value.

The const pointer to test_tt is right; I can read the correct values of the other fields.
Even stranger:

When I change the order of the fields in test_tt, it works, and with yet another order it does not!?

In emulation it works fine, so it is hard to debug.

Could anybody please give me a hint what could be the cause for that?



Which OS/compiler?
The compiler can insert padding or decide to align on different boundaries.

If it is Linux or Mac OS X, try the -malign-double flag.

From the release notes:

When compiling with GCC, special care must be taken for structs that
contain 64-bit integers. This is because GCC aligns long longs
to a 4 byte boundary by default, while NVCC aligns long longs
to an 8 byte boundary by default. Thus, when using GCC to
compile a file that has a struct/union, users must give the
-malign-double option to GCC. When using NVCC, this option is
automatically passed to GCC.
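
The effect described in the release note can be reproduced with a single compiler. This is only a rough sketch in C++: #pragma pack(4) stands in for a compiler that aligns 64-bit integers to 4 bytes inside structs, and the names Natural, Packed4 and print_long_long_layouts are made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Default layout: on common 64-bit targets the 64-bit member is aligned to
// 8 bytes, so padding is inserted after the 4-byte tag.
struct Natural {
    std::int32_t tag;
    std::int64_t value;
};

// Assumption for illustration: pack(4) caps member alignment at 4 bytes,
// imitating a compiler that aligns long long to a 4 byte boundary.
#pragma pack(push, 4)
struct Packed4 {
    std::int32_t tag;
    std::int64_t value;
};
#pragma pack(pop)

// Prints where `value` lands and how large each struct is.
void print_long_long_layouts() {
    std::printf("natural: value@%zu size=%zu\n",
                offsetof(Natural, value), sizeof(Natural));
    std::printf("packed4: value@%zu size=%zu\n",
                offsetof(Packed4, value), sizeof(Packed4));
}
```

If the two sides of a host/device copy disagree like these two structs do, the bytes arrive intact but every member after the first is read from the wrong offset.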

It is Linux, there are no 64-bit integers, and I also use nvcc…
I still wonder why it works in emulation mode and why it depends on the order inside the struct…

The struct is 64 bytes on the device and 60 bytes on the host.
I do not use doubles - does nvcc handle float2 like one?
So far, -mno-align-double has not helped…
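
One way to narrow this down is to print the size and member offsets on both sides and compare them. The sketch below is host-only C++ and rests on assumptions: Pair4 and Pair8 are hypothetical stand-ins for a float2 with 4-byte and 8-byte alignment, long is replaced by a 4-byte integer (as on 32-bit Linux), and dump_layout is an illustrative helper, not part of CUDA.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for float2 under two alignment rules.
struct Pair4 { float x, y; };             // aligned like two plain floats
struct alignas(8) Pair8 { float x, y; };  // aligned to 8 bytes

template <typename Complex>
struct Matrix22 { Complex a, b, c, d; };

// Same member order as test_tt in the post.
template <typename Complex>
struct TestT {
    std::int32_t test;          // assumption: long is 4 bytes
    Matrix22<Complex> matrix;
    float lambda;
    Complex n_sub;
    float sin_phi;
    bool pol_s;
};

// Prints the total size and the offset of every member after `test`.
template <typename T>
void dump_layout(const char* label) {
    std::printf("%s: size=%zu matrix@%zu lambda@%zu n_sub@%zu sin_phi@%zu pol_s@%zu\n",
                label, sizeof(T),
                offsetof(T, matrix), offsetof(T, lambda), offsetof(T, n_sub),
                offsetof(T, sin_phi), offsetof(T, pol_s));
}
```

Calling dump_layout<TestT<Pair4>>("align4") and dump_layout<TestT<Pair8>>("align8") shows the two layouts side by side. On common ABIs the 8-byte variant comes out at 64 bytes, which matches the device size reported here, with matrix starting at offset 8 instead of 4. The 4-byte variant comes out at 56 rather than 60, so the real host definitions presumably differ from this sketch in some detail.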

Note for those with the same problem:

Writing __align__(32) before every struct helped.
I would be glad if somebody could tell me why. gcc/g++ with -malign-double did nothing.
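
On why the field order mattered: one plausible explanation, assuming the host and device disagree about the alignment of float2, is that some orders happen to place every float2 on a multiple of 8 bytes anyway, so no padding is needed under either rule. The sketch below is host-only C++; Pair4, Pair8, Reordered and layouts_agree are hypothetical names, and long is again assumed to be 4 bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for float2 under two alignment rules.
struct Pair4 { float x, y; };             // 4-byte alignment
struct alignas(8) Pair8 { float x, y; };  // 8-byte alignment

template <typename Complex>
struct Matrix22 { Complex a, b, c, d; };

// Same members as test_tt, but the most strictly aligned ones come first,
// so each Complex already sits on a multiple of 8 bytes.
template <typename Complex>
struct Reordered {
    Matrix22<Complex> matrix;
    Complex n_sub;
    float lambda;
    float sin_phi;
    std::int32_t test;   // assumption: long is 4 bytes
    bool pol_s;
};

// True when both alignment rules give the same size and member offsets.
inline bool layouts_agree() {
    using A = Reordered<Pair4>;
    using B = Reordered<Pair8>;
    return sizeof(A) == sizeof(B)
        && offsetof(A, matrix) == offsetof(B, matrix)
        && offsetof(A, n_sub) == offsetof(B, n_sub)
        && offsetof(A, lambda) == offsetof(B, lambda)
        && offsetof(A, sin_phi) == offsetof(B, sin_phi)
        && offsetof(A, test) == offsetof(B, test)
        && offsetof(A, pol_s) == offsetof(B, pol_s);
}
```

With this order the 4-byte test field no longer pushes matrix to an offset that only one side has to round up. This does not show what __align__(32) changes in the original code; it only illustrates why some member orders are insensitive to the alignment rule.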