Hello forum.
Over the last few days I have started playing with CUDA and find it really nice.
But when I use structs, there seems to be a difference between emulation mode and the device (8500GT).
I have the following struct:
struct test_tt {
    long test;
    t_matrix22 matrix;
    float lambda;
    t_complex n_sub;
    float sin_phi;
    bool pol_s;
};
where
#define t_complex float2
struct t_matrix22 {
    t_complex a;
    t_complex b;
    t_complex c;
    t_complex d;
};
Nothing special so far.
I fill a test_tt with data, upload it to the device, and can download it again without problems.
All data stays intact (both in emulation and on the device).
But when the kernel accesses the struct, the data fields appear to be shifted,
e.g. matrix.a.x = 2 in the kernel shows up in matrix.a.y as seen from the host.
The kernel itself reads back the “right” value.
The const pointer to test_tt is correct; I can read the right values of other fields.
Even stranger:
With some orderings of the fields in test_tt it works, and with others it does not!?
In emulation it works fine, so it is hard to debug.
Could anybody please give me a hint as to what the cause could be?
Thanks,
Martin
Which OS/compiler?
The compiler can insert padding or align fields on different boundaries.
If it is Linux or Mac OS X, try the -malign-double flag.
From the release notes:
When compiling with GCC, special care must be taken for structs that
contain 64-bit integers. This is because GCC aligns long longs
to a 4 byte boundary by default, while NVCC aligns long longs
to an 8 byte boundary by default. Thus, when using GCC to
compile a file that has a struct/union, users must give the
-malign-double option to GCC. When using NVCC, this option is
automatically passed to GCC.
It is Linux, there are no 64-bit integers, and I also use nvcc…
I still wonder why it works in emulation mode and why it depends on the order of the fields inside the struct…
This thread helped: http://forums.nvidia.com/index.php?act=ST&f=75&t=66946
The struct is 64 bytes on the device and 60 bytes on the host.
I do not use doubles. Does nvcc treat float2 like one?
So far, -mno-align-double has not helped…
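A minimal host-only sketch of how such a size difference could arise. It assumes (this is not confirmed anywhere in the thread) that the device compiler aligns float2 to 8 bytes while the host compiler aligns it to 4. Complex4, Complex8, HostView and DeviceView are hypothetical stand-ins, and std::int32_t replaces long on the assumption of a 32-bit host.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for float2: same members, different alignment.
struct Complex4 { float x, y; };             // 4-byte aligned (assumed host view)
struct alignas(8) Complex8 { float x, y; };  // 8-byte aligned (assumed device view)

struct Matrix4 { Complex4 a, b, c, d; };
struct Matrix8 { Complex8 a, b, c, d; };

// std::int32_t stands in for `long` on a 32-bit host (an assumption).
struct HostView {
    std::int32_t test;
    Matrix4 matrix;
    float lambda;
    Complex4 n_sub;
    float sin_phi;
    bool pol_s;
};

struct DeviceView {
    std::int32_t test;
    Matrix8 matrix;   // 8-byte alignment forces padding after `test`
    float lambda;
    Complex8 n_sub;   // and again after `lambda`
    float sin_phi;
    bool pol_s;
};

// Byte offsets of matrix.a.x and matrix.a.y inside each view
// (`a` is the first member of the matrix, so its own offset is 0).
inline std::size_t host_a_x()   { return offsetof(HostView, matrix) + offsetof(Complex4, x); }
inline std::size_t host_a_y()   { return offsetof(HostView, matrix) + offsetof(Complex4, y); }
inline std::size_t device_a_x() { return offsetof(DeviceView, matrix) + offsetof(Complex8, x); }

// Print both layouts side by side.
inline void print_layouts() {
    std::printf("sizeof:     host view %zu, device view %zu\n",
                sizeof(HostView), sizeof(DeviceView));
    std::printf("matrix.a.x: host view %zu, device view %zu\n",
                host_a_x(), device_a_x());
    std::printf("matrix.a.y: host view %zu\n", host_a_y());
}
```

Calling print_layouts() from main shows the two views disagreeing in size, and the device view's matrix.a.x landing on the same byte offset as the host view's matrix.a.y, which would match the shifted write described earlier. The exact numbers depend on the stand-in types and need not reproduce the 60 bytes reported for the real host build. It would also fit the observation that reordering the fields changes the behaviour, since padding depends on what precedes each aligned member.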
Note for those with the same problem:
writing __align__(32) before every struct helped.
I would be glad if somebody could tell me why. gcc/g++ with -malign-double did nothing.
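For anyone who wants to guard against this kind of mismatch, here is a small sketch of pinning the layout explicitly and letting the compiler check it. It uses standard C++ alignas as a stand-in for CUDA's __align__, int instead of long, and a hypothetical Complex type in place of float2; the expected offsets are simply what these stand-in declarations produce, not values taken from the thread.

```cpp
#include <cstddef>

// Hypothetical stand-in for float2, explicitly 8-byte aligned.
struct alignas(8) Complex { float x, y; };

// Every struct carries an explicit alignment, as in the workaround above.
struct alignas(32) Matrix22 { Complex a, b, c, d; };

struct alignas(32) TestTT {
    int test;          // the thread uses long; int keeps the size fixed here
    Matrix22 matrix;
    float lambda;
    Complex n_sub;
    float sin_phi;
    bool pol_s;
};

// Compile-time layout checks: if any compiler that sees this header
// lays the struct out differently, the build fails instead of the
// kernel silently writing to the wrong field.
static_assert(sizeof(Matrix22) == 32, "unexpected Matrix22 size");
static_assert(offsetof(TestTT, matrix) == 32, "unexpected matrix offset");
static_assert(offsetof(TestTT, n_sub) == 72, "unexpected n_sub offset");
static_assert(sizeof(TestTT) == 96, "unexpected TestTT size");

// Runtime guard: compare against a size reported by the other side,
// e.g. a sizeof value written back by a kernel.
inline bool same_size_as_host(std::size_t other_size) {
    return other_size == sizeof(TestTT);
}
```

This does not explain why the attribute helped in the original case. It only shows one way to make the intended layout explicit, so that a disagreement between compilers shows up as a build error or a failed size comparison.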