Hi, I want to use memory-mapped IO to map a large file (>100GB) and then make it accessible to the GPU, but I'm not sure how to do that. Here is what I've already tried:
cudaHostRegister: cudaHostRegisterMapped only works for small allocations, and cudaHostRegisterIoMemory always returns an invalid-argument error regardless of the allocation size. I still think cudaHostRegisterIoMemory could be a solution, but I don't understand why using that flag always results in the error.
GPUDirect Storage: I also considered GDS, but I didn't find a way to perform memory-mapped IO with it.
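As far as I can tell, the cuFile API only offers explicit reads into a device buffer at a given file offset, along the lines of the sketch below (untested, no error handling), rather than anything with mmap semantics:

```cpp
#include <cufile.h>
#include <cuda_runtime.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

// Read len bytes starting at file_off directly into a device buffer.
void read_chunk(const char *path, void *dev_buf, size_t len, off_t file_off) {
    int fd = open(path, O_RDONLY | O_DIRECT);

    CUfileDescr_t descr;
    memset(&descr, 0, sizeof(descr));
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;

    CUfileHandle_t handle;
    cuFileHandleRegister(&handle, &descr);
    cuFileBufRegister(dev_buf, len, 0);

    // Explicit copy: file[file_off .. file_off + len) -> dev_buf
    cuFileRead(handle, dev_buf, len, file_off, 0);

    cuFileBufDeregister(dev_buf);
    cuFileHandleDeregister(handle);
    close(fd);
}
```

So with GDS I would still have to manage chunked reads myself instead of mapping the file once.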
HMM: I think heterogeneous memory management would work for what I'm planning to do, but our GPUs don't support it, so I'd prefer another method.