Is there a way to give nvcc a hint about the alignment of an input pointer?

I have a problem when loading a uchar4, for example, from a char *, something like:

char * buf;
// 4 bytes per thread, so each thread's address stays 4-byte aligned
uchar4 val = *(uchar4 *)&buf[4 * threadIdx.x];

It seems that CUDA won’t assume that buf is aligned to 4 bytes, so it breaks the load down into four separate one-byte loads. The only fix I have found is to load the data as an integer, but that doesn’t work with larger sizes (such as int2, for example).
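For reference, a minimal sketch of the "load as an integer" workaround described above. The kernel name, the out parameter, and the 4-bytes-per-thread layout are assumptions made for illustration; it also assumes buf comes straight from cudaMalloc, so the base address is suitably aligned.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: the name, the 'out' parameter and the layout
// (4 bytes per thread) are assumptions, not taken from the original post.
// Assumes 'buf' is at least 4-byte aligned, e.g. returned by cudaMalloc.
__global__ void load_as_uint(const char *buf, uchar4 *out)
{
    // Read this thread's four bytes as one 32-bit word.
    unsigned int word =
        reinterpret_cast<const unsigned int *>(buf)[threadIdx.x];

    // Unpack the word. The GPU is little-endian, so the lowest byte
    // of 'word' is buf[4 * threadIdx.x].
    uchar4 val;
    val.x = static_cast<unsigned char>(word & 0xFF);
    val.y = static_cast<unsigned char>((word >> 8) & 0xFF);
    val.z = static_cast<unsigned char>((word >> 16) & 0xFF);
    val.w = static_cast<unsigned char>(word >> 24);

    out[threadIdx.x] = val;
}
```

The unpacking is what makes this approach awkward to scale: every wider type needs its own hand-written variant.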

Is there a way to explicitly tell the compiler that a given pointer is guaranteed to be aligned to some boundary?

Interestingly, OpenCL does seem to assume this (probably because in OpenCL you have to pass a cl_mem object rather than a raw pointer).


There is an __align__(x) qualifier, documented with x = 8 or 16 (I guess it will also take 4), which is used on structs. So if you wrapped your array in a struct with that qualifier, maybe that would do it, e.g. struct __align__(4) { char *buf; } mystruct; or something like that.
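A rough sketch of the struct idea, under the assumption that the data can be described as an array of small fixed-size records. Pixel8 and copy_pixels are hypothetical names. Note that the qualifier applies to the struct type itself, so the bytes have to live inside the struct; putting it on a struct that only holds a char * aligns the pointer variable, not the memory it points to.

```cuda
#include <cuda_runtime.h>

// Hypothetical 8-byte record. __align__(8) makes every Pixel8 object
// 8-byte aligned, so the compiler is allowed to move it as one
// 64-bit word instead of eight separate bytes.
struct __align__(8) Pixel8 {
    unsigned char c[8];
};

// Each thread copies one record; 'in' and 'out' are assumed to point
// to arrays of Pixel8 allocated with cudaMalloc.
__global__ void copy_pixels(const Pixel8 *in, Pixel8 *out)
{
    out[threadIdx.x] = in[threadIdx.x];
}
```

Whether a single wide load is actually emitted is worth checking in the generated PTX or SASS (for example with cuobjdump) rather than taking on faith.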

I would also try, more simply,
char (*buf)[4];  // pointer to groups of 4 chars
uchar4 val = *(uchar4 *)buf[threadIdx.x];

Sorry, my C is pretty rusty… worth a try maybe.
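Another thing that may be worth trying, assuming a reasonably recent CUDA toolkit: nvcc accepts the __builtin_assume_aligned hint in device code, which states the alignment promise directly. The kernel below is only a sketch; load_int2 and its parameters are made-up names, and the promise must actually hold at run time (here, that buf is 8-byte aligned, as a cudaMalloc pointer would be).

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: names and layout (8 bytes per thread) are
// assumptions for illustration only.
__global__ void load_int2(const char *buf, int2 *out)
{
    // Promise the compiler that 'buf' is at least 8-byte aligned.
    // If that is not true at run time, the behaviour is undefined.
    const int2 *p =
        static_cast<const int2 *>(__builtin_assume_aligned(buf, 8));

    // With the promise in place, an 8-byte vector load is permitted.
    out[threadIdx.x] = p[threadIdx.x];
}
```

Unlike the integer workaround, this does not depend on the element size, so the same pattern should carry over to int4 with an alignment of 16.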