I am experimenting with porting some existing code (the x264 encoder) to CUDA. The issue I am running into is how to specify data alignment in NVCC.
The existing x264 code base uses declarations that force the alignment of individual members within a struct. It looks like this:
/* Current MB DCT coeffs */
struct
{
    DECLARE_ALIGNED( int, luma16x16_dc[16], 16 );
    DECLARE_ALIGNED( int, chroma_dc[2][4], 16 );
    // FIXME merge with union
    DECLARE_ALIGNED( int, luma8x8[4][64], 16 );
    union
    {
        DECLARE_ALIGNED( int, residual_ac[15], 16 );
        DECLARE_ALIGNED( int, luma4x4[16], 16 );
    } block[16+8];
} dct;
The DECLARE_ALIGNED macro is defined elsewhere as:
# define DECLARE_ALIGNED( type, var, n ) type var __attribute__((aligned(n)))
The struct declaration above fails to compile with version 0.8 of nvcc on RHEL 4.3.
In Section 6.1.2.1 of the CUDA Programming Guide, the align specification is stated to apply to structs. How do I force the alignment of individual members of a struct?
Regards,
Spencer