cuMemHostAlloc and memory alignment


Disclaimer: I’m only just implementing this at present, so I don’t yet know whether the memory returned by cuMemHostAlloc is aligned. I will report back when finished.

The reference manual doesn’t specify whether the memory returned by cuMemHostAlloc is aligned and, if so, to what boundary…

I’m attempting to hack up my own poor man’s DMA on an ION system, transferring directly from my source device into pinned memory and accessing that memory via zero-copy. However, our device driver requires 32-byte aligned memory…

I’m just wondering if it’s possible for cuMemHostAlloc to take an alignment flag in the future (or maybe a special cuMemHostAllocAligned function call for this purpose), as I’m sure alignment requirements on pinned memory won’t be an uncommon case, especially on devices with zero-copy access…

Any comments would be great!
(Even better, any dodgy hacks to try and get N-byte aligned memory out of cuMemHostAlloc? :D)
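
For what it’s worth, the usual workaround when an allocator gives no alignment guarantee is to over-allocate by (alignment - 1) bytes and round the returned pointer up to the next boundary. Below is a minimal sketch of that idea in plain C. The helper names align_up and padded_size are made up for illustration and are not part of CUDA, and I haven’t tested this against cuMemHostAlloc itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a CUDA call): bytes to request so that an
 * aligned region of `size` bytes always fits inside the allocation. */
static size_t padded_size(size_t size, size_t alignment)
{
    return size + alignment - 1;
}

/* Hypothetical helper (not a CUDA call): round p up to the next multiple
 * of `alignment`, which must be a non-zero power of two. */
static void *align_up(void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)alignment - 1;

    /* Adding the mask and clearing the low bits rounds up; an already
     * aligned address is returned unchanged. */
    return (void *)((addr + mask) & ~mask);
}
```

The intended use would be to request padded_size(n, 32) bytes from cuMemHostAlloc, hand align_up(raw, 32) to the device driver, and keep the original raw pointer around, since that is the one that has to go back to cuMemFreeHost. I’m assuming here that the same byte offset would also have to be applied to the device-side pointer used for zero-copy access.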


Thought I’d come back and give some feedback.

cuMemHostAlloc appears to always return 32-byte aligned memory addresses, at least on Windows XP with CUDA 2.2. I’m not sure whether this is intentional (it’s not documented, to my knowledge) or whether it holds on all systems.
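
For anyone wanting to repeat the check on another system, here is a rough sketch of how a returned pointer can be tested. It is plain C with no CUDA calls in it. The names is_aligned and pointer_alignment are hypothetical helpers of my own, and a result from a handful of allocations is only an observation, not a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: non-zero if p is a multiple of `alignment`,
 * which must be a non-zero power of two. */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & ((uintptr_t)alignment - 1)) == 0;
}

/* Hypothetical helper: largest power of two that divides the address,
 * i.e. its lowest set bit. Returns 0 for a null pointer. */
static size_t pointer_alignment(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    return (size_t)(addr & (~addr + 1));
}
```

Running pointer_alignment over the results of many allocations of different sizes and taking the minimum gives the alignment actually observed. Note that a single pointer may happen to be aligned to a much larger boundary than the allocator promises.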