I thought this was a packing issue, but #pragma pack(show) tells me that the pack size for C++ is the default (8); CUDA does not support this pragma. So I’m a little lost and not sure how to solve the problem. Can anybody help?
Apparently built-in types such as uint2 are defined to be aligned on an 8-byte boundary. The problem is that this is done only for the CUDA compiler.
An example type:
/*DEVICE_BUILTIN*/
struct __builtin_align__(8) uint2
{
    unsigned int x, y;
};
The __builtin_align__ macro is defined for the CUDA compiler, but not for the GCC/MSVC compilers, so the host side most likely aligns the data differently. As a result, when you copy a structure containing this type from Visual C++ to CUDA, the members can end up at different offsets on the two sides (depending on the layout of the structure).
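To make the mismatch concrete, here is a minimal host-only sketch. The names uint2_host, uint2_dev, ParamsHost and ParamsDev are made up for illustration and are not from the CUDA headers; alignas(8) stands in for __builtin_align__(8), and the numbers assume a 4-byte int, as on common desktop platforms.

```cpp
#include <cstddef>

// Hypothetical stand-ins, not the real CUDA headers.
// uint2_host: what a host compiler sees if the alignment macro has no effect.
struct uint2_host { unsigned int x, y; };
// uint2_dev: what the CUDA compiler sees; alignas(8) plays the role of
// __builtin_align__(8).
struct alignas(8) uint2_dev { unsigned int x, y; };

// The same (made-up) parameter struct, once per compiler's view.
struct ParamsHost { int flag; uint2_host v; };
struct ParamsDev  { int flag; uint2_dev  v; };

// Assumption: int and unsigned int are 4 bytes with 4-byte alignment.
static_assert(offsetof(ParamsHost, v) == 4,  "host view: v directly follows flag");
static_assert(sizeof(ParamsHost)      == 12, "host view: no padding");
static_assert(offsetof(ParamsDev, v)  == 8,  "device view: 4 padding bytes before v");
static_assert(sizeof(ParamsDev)       == 16, "device view: padded to a multiple of 8");
```

If the host fills in the 12-byte layout and the kernel reads the 16-byte one, v is read from the wrong offset, which would explain the kind of corruption described above.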
For me, the solution was to modify the structure definition to:
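One way such a change could look, as a sketch under assumptions rather than a definitive fix: give the host-side type the same explicit 8-byte alignment that the CUDA compiler applies. The names my_uint2 and Params are hypothetical, and alignas assumes a C++11 host compiler; older compilers have their own spelling, for example __declspec(align(8)) on MSVC or __attribute__((aligned(8))) on GCC.

```cpp
#include <cstddef>

// Hypothetical host-side replacement for uint2 that carries the same
// 8-byte alignment on the host as the CUDA compiler uses on the device.
struct alignas(8) my_uint2 {
    unsigned int x, y;
};

// Made-up example structure that gets copied from host code to the device.
struct Params {
    int      flag;  // 4 bytes, followed by 4 bytes of padding
    my_uint2 v;     // starts on an 8-byte boundary
};

// Compile-time checks of the layout (assumes a 4-byte int).
static_assert(alignof(my_uint2)  == 8,  "my_uint2 must be 8-byte aligned");
static_assert(offsetof(Params, v) == 8,  "v must start at offset 8");
static_assert(sizeof(Params)      == 16, "Params must be 16 bytes");
```

Putting the same static_assert checks in a header that both the host compiler and the CUDA compiler include makes a layout mismatch fail at compile time instead of silently corrupting data.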
My suggestion is to at least mention this in the CUDA SDK Sample for the CPP-integration project, which shows how simple it is to use built-in types in C++ code.