I am currently using cuMemHostAlloc for the performance advantages of pinned, write-combined memory.
I would like to add the ability for a separate process to place data directly into that memory. As far as I know, there is no way to share cuMemHostAlloc’d memory with another process.
The other option, instead of sharing memory allocated by CUDA, is to pin and write-combine a buffer allocated with shmget(), but I doubt CUDA would treat that memory the same way as one of its own allocations.
Obviously I could just allocate interprocess memory with shmget() and memcpy it to the pinned/WC memory before upload, but the extra memcpy would largely negate the benefits of using cuMemHostAlloc in the first place.
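
For reference, here is a rough sketch of that fallback, just to make the extra copy concrete. It is plain POSIX C with no CUDA calls, so it builds anywhere: the helper names (shared_attach, shared_release, stage_for_upload) are made up for illustration, IPC_PRIVATE stands in for a real key shared between processes, and the staging buffer is an assumption standing in for the pointer returned by cuMemHostAlloc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>

/* Hypothetical helper: create a System V segment and attach it.
 * Returns the mapped address, or NULL on failure. IPC_PRIVATE is used
 * only to keep the sketch self-contained; a real multi-process setup
 * would agree on a key (for example via ftok()). */
void *shared_attach(size_t size, int *shmid_out)
{
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600);
    if (shmid < 0)
        return NULL;

    void *addr = shmat(shmid, NULL, 0);
    if (addr == (void *)-1) {
        shmctl(shmid, IPC_RMID, NULL);  /* do not leak the segment */
        return NULL;
    }

    *shmid_out = shmid;
    return addr;
}

/* Hypothetical helper: detach the segment and mark it for removal. */
void shared_release(void *addr, int shmid)
{
    shmdt(addr);
    shmctl(shmid, IPC_RMID, NULL);
}

/* The extra copy in question: shared segment -> staging buffer.
 * In the real program `staging` would be the cuMemHostAlloc'd pointer
 * (assumption); here it is any writable buffer of `capacity` bytes. */
void stage_for_upload(void *staging, const void *shared,
                      size_t len, size_t capacity)
{
    assert(len <= capacity);
    memcpy(staging, shared, len);
}
```

In the real setup the producer process would attach the same segment and write into it, and the consumer would call stage_for_upload() right before the host-to-device copy. That memcpy is a full extra pass over every buffer, which is the cost I am trying to avoid.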