struct differs between emu and device

Hello forum.
Over the last few days I have started playing with CUDA and find it really nice.
But when I use structs, there seems to be a difference between emulation mode and the device (8500GT).

I have the following struct:

struct test_tt {
    long test;
    t_matrix22 matrix;
    float lambda;
    t_complex n_sub;
    float sin_phi;
    bool pol_s;
};

where
#define t_complex float2
struct t_matrix22 {
    t_complex a;
    t_complex b;
    t_complex c;
    t_complex d;
};

Nothing special so far.
I take a test_tt, fill it with data, upload it to the device, and can download it again without problems.
All data stays intact (emu + device).

When the kernel accesses the struct, the data fields look shifted:
e.g. writing matrix.a.x = 2 in the kernel shows up in matrix.a.y as seen from the host.
The kernel itself reads the “right” value.

The const pointer to test_tt is correct: I can read the right values of other fields.
Even stranger:

When I change the order of the fields in test_tt, it works, and in another order it does not!?

In emulation it works fine, so it is hard to debug.

Could anybody please give me a hint as to what could cause this?

Thanks,

Martin

Which OS/compiler?
The compiler can insert padding or decide to align on different boundaries.

If it is Linux or Mac OS X, try the -malign-double flag.

From the release notes:

When compiling with GCC, special care must be taken for structs that
contain 64-bit integers. This is because GCC aligns long longs
to a 4 byte boundary by default, while NVCC aligns long longs
to an 8 byte boundary by default. Thus, when using GCC to
compile a file that has a struct/union, users must give the
-malign-double
option to GCC. When using NVCC, this option is automatically
passed to GCC.

It is Linux, there are no 64-bit integers, and I also use nvcc…
I still wonder why it works in emulation mode and why it depends on the field order inside the struct…

http://forums.nvidia.com/index.php?act=ST&f=75&t=66946
helped.

The struct is 64 bytes on the device and 60 bytes on the host.
I do not use doubles. Does nvcc handle float2 like one?
So far -mno-align-double has not helped…

Note for those with the same problem:

Writing __align__(32) before every struct helped.
I would be glad if somebody could tell me why. gcc/g++ with -malign-double did nothing.