Misaligned pointer from cudaHostGetDevicePointer

Hi,
I have a block of host memory containing 4-byte integers. I register it with cudaHostRegister (using the cudaHostRegisterMapped flag) and then get a device-side pointer with cudaHostGetDevicePointer. This works fine most of the time, but occasionally cudaHostGetDevicePointer returns a pointer that is not 4-byte aligned, and I then get cudaErrorMisalignedAddress when I access it on the device.
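
A minimal sketch of the kind of setup described above, not the poster's actual code. The helper names `isAligned` and `mapHostInts` are hypothetical, and the alignment checks are only there to show whether the misalignment is already present in the host pointer or first appears in the device pointer:

```cuda
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cuda_runtime.h>

// True if p is a multiple of `alignment` bytes (alignment must be a power of two).
static bool isAligned(const void* p, std::size_t alignment) {
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: registers an existing host buffer of `count` ints as
// mapped pinned memory and returns the device-side pointer, or nullptr on failure.
// Assumption: on older setups, cudaSetDeviceFlags(cudaDeviceMapHost) may need
// to be called before the CUDA context is created.
static int* mapHostInts(int* hostPtr, std::size_t count) {
    // Check the host pointer first, before touching the CUDA runtime.
    if (!isAligned(hostPtr, alignof(int))) {
        std::fprintf(stderr, "host pointer %p is not int-aligned\n",
                     static_cast<void*>(hostPtr));
        return nullptr;
    }

    cudaError_t err = cudaHostRegister(hostPtr, count * sizeof(int),
                                       cudaHostRegisterMapped);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaHostRegister: %s\n", cudaGetErrorString(err));
        return nullptr;
    }

    void* devPtr = nullptr;
    err = cudaHostGetDevicePointer(&devPtr, hostPtr, 0);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaHostGetDevicePointer: %s\n",
                     cudaGetErrorString(err));
        cudaHostUnregister(hostPtr);
        return nullptr;
    }

    // Report (but do not hide) a misaligned device pointer.
    if (!isAligned(devPtr, alignof(int))) {
        std::fprintf(stderr, "device pointer %p is not int-aligned\n", devPtr);
    }
    return static_cast<int*>(devPtr);
}
```

Calling `mapHostInts(buffer, n)` on the buffer in question and watching which message appears should narrow down where the misalignment comes from.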

Is there a way to fix this by forcing the pointers I am given to be 4-byte aligned?

Thanks.

What if you use cudaHostAlloc() to pin the memory instead?
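
A rough sketch of that suggestion, assuming the buffer can be allocated through CUDA rather than registered after the fact. The helper name `allocMappedInts` is made up for illustration. The expectation that memory from cudaHostAlloc comes back suitably aligned for `int` is an assumption worth checking on the actual system:

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: allocates `count` ints of mapped pinned host memory and
// returns both the host pointer and its device-side alias through out-parameters.
static cudaError_t allocMappedInts(std::size_t count, int** hostOut, int** devOut) {
    *hostOut = nullptr;
    *devOut = nullptr;

    // Let the CUDA runtime allocate (and pin) the host memory itself.
    void* host = nullptr;
    cudaError_t err = cudaHostAlloc(&host, count * sizeof(int), cudaHostAllocMapped);
    if (err != cudaSuccess) {
        return err;
    }

    // Get the device-side pointer for the same allocation.
    void* dev = nullptr;
    err = cudaHostGetDevicePointer(&dev, host, 0);
    if (err != cudaSuccess) {
        cudaFreeHost(host);  // do not leak the pinned allocation on failure
        return err;
    }

    *hostOut = static_cast<int*>(host);
    *devOut = static_cast<int*>(dev);
    return cudaSuccess;
}
```

Memory obtained this way is released with cudaFreeHost rather than cudaHostUnregister. If the integers already live in another buffer, they would have to be copied into this one first.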