Sharing PagedLockMemory between Processes

Hello,

Sorry for starting a new topic about a wish, but I was not able to reply to the wishlist at http://forums.nvidia.com/index.php?showtop…rt=#entry248850 because it was locked.
My problem is that I need to share host memory buffers between different processes on a Linux system. Currently I am using Linux System V shared memory segments for this. I need to copy these buffers to CUDA device memory. As I cannot register an existing buffer with the CUDA runtime for DMA transfers, and I cannot attach a buffer allocated with cudaMallocHost to a second process, I am currently out of luck and cannot use DMA transfers.
This thread also discusses a similar problem:
http://forums.nvidia.com/index.php?act=ST&…=71&t=41710
And something like this has already been mentioned on the wishlist here:
http://forums.nvidia.com/index.php?showtop…aded&start=

As these two posts are both quite old, I want to know if there is anything new about this issue. To solve my problem I see two possibilities:

  1. Extend the CUDA runtime with functions like shmget and shmat from Linux System V IPC, so that a buffer created by cudaMallocHost can be shared between processes.
  2. Create a buffer with shmget, page-lock that buffer with shmctl, and register this page-locked buffer with the CUDA runtime.

Please correct me if I am wrong or have overlooked something.

DMA transfers are important to me for two reasons:

  1. They are faster.
  2. They enable asynchronous launches with CUDA streams.
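For readers unfamiliar with point 2, the following is a rough sketch of the overlap that page-locked memory makes possible (it assumes a CUDA-capable system and the CUDA toolkit; the function name copy_async is just an illustration). cudaMemcpyAsync is only truly asynchronous with respect to the host when the host buffer is page-locked, which is why buffers from plain shmget cannot currently be used this way.

```c
/* Sketch: asynchronous host-to-device copy overlapping with CPU work.
 * Requires a CUDA-capable GPU; error checking omitted for brevity. */
#include <cuda_runtime.h>

void copy_async(float *dev, size_t n)
{
    float *host;
    cudaStream_t stream;

    /* cudaMallocHost returns page-locked memory; only such memory can
     * be copied with cudaMemcpyAsync while the CPU keeps working. */
    cudaMallocHost((void **)&host, n * sizeof(float));
    cudaStreamCreate(&stream);

    cudaMemcpyAsync(dev, host, n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    /* ... CPU work here overlaps with the DMA transfer ... */
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFreeHost(host);
}
```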

Best regards

Jiri Kraus

I’d like to be able to do the same thing! I already asked for this a while ago in a wish-list topic. Some advantages I see:

    Ability to really separate I/O from the algorithm

    Easily switch the algorithm or I/O at runtime

    I’m sure there are more!

Same here.