Can we use cuMemExportToShareableHandle/cuMemImportFromShareableHandle to replicate the functionality of cudaIpcGetMemHandle/cudaIpcOpenMemHandle?

Hello everyone, I have run into an issue with cuMemImportFromShareableHandle. I can run the official NVIDIA sample memMapIPCDrv, but when I split the export and import into two processes and send the shareable handle produced by cuMemExportToShareableHandle to the other process over a socket, cuMemImportFromShareableHandle returns CUDA_ERROR_INVALID_DEVICE when given that handle. So I want to know whether cuMemExportToShareableHandle/cuMemImportFromShareableHandle can fully replicate the functionality of cudaIpc*, and if so, how I should use the cuMem* API to do it. Thanks!

Hello! I’ve run into this problem recently too. Specifically, when I export pre-allocated handles from one Python process and send them to another over a socket, the import call fails with the same CUDA_ERROR_INVALID_DEVICE status. However, nothing goes wrong if I export and then import the handles within the same process. Have you solved this problem in the past year?

The issue I encountered was due to a misunderstanding of operating system concepts. After carefully examining the CUDA samples, I realized that the exported handles I was passing (plain integers) are actually file descriptors of the original process. A file descriptor number is only an index into that process's own descriptor table, so sending the integer itself to another process is meaningless: the target process either has nothing open at that number or has some unrelated file there.
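To make the failure mode concrete, here is a minimal sketch (no CUDA involved, and assuming a Unix-like system) that hands a raw descriptor number to a freshly started process, the same way I was sending the integer over the socket. The helper name child_sees_same_object and the use of a pipe as a stand-in for the exported handle are my own illustration, not something from the CUDA sample.

```python
import os
import subprocess
import sys

# Code run in the second process: look up whatever (if anything) is open
# at the descriptor number it was told about.
CHILD = r"""
import os, sys
try:
    st = os.fstat(int(sys.argv[1]))
    print(st.st_dev, st.st_ino)
except OSError as exc:
    print("error", exc.errno)
"""

def child_sees_same_object():
    """Pass only the fd *number* to another process and check what it finds."""
    r, w = os.pipe()  # stand-in for an exported handle (assumption)
    try:
        mine = os.fstat(w)
        out = subprocess.run(
            [sys.executable, "-c", CHILD, str(w)],
            capture_output=True, text=True, check=True,
        ).stdout.split()
    finally:
        os.close(r)
        os.close(w)
    # True only if the child found the very same kernel object at that number.
    return out == [str(mine.st_dev), str(mine.st_ino)]
```

The function returns False: the second process either gets an error for that number or finds some unrelated object there, which is consistent with the import call rejecting the handle.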

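The proper solution is to pass the file descriptors themselves using the system calls designed for that: sendmsg/recvmsg (declared in sys/socket.h) over a Unix domain socket, with an SCM_RIGHTS control message carrying the descriptors. In Python 3.9 and later, you can do this with the send_fds and recv_fds functions in the socket module.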

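With this approach the kernel installs a new descriptor in the receiving process that refers to the same underlying object. The number may differ from the one in the sending process, but the target process can now pass it to the import call and use it correctly.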
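For anyone landing here later, this is roughly what the descriptor transfer looks like in Python. It is only a sketch under a few assumptions: Python 3.9+ on Linux, a Unix domain socket between the two processes, and a handle exported with CU_MEM_HANDLE_TYPE_POSIX_FILE_DESCRIPTOR. The names send_handle, recv_handle and demo_round_trip are made up for illustration, and a pipe stands in for the CUDA handle so the snippet runs without a GPU.

```python
import os
import socket

def send_handle(sock, fd):
    """Exporter side: send the descriptor itself via SCM_RIGHTS.

    In the CUDA case, fd would be the value produced by
    cuMemExportToShareableHandle (assumption: POSIX fd handle type).
    At least one byte of ordinary data has to accompany the descriptor.
    """
    socket.send_fds(sock, [b"h"], [fd])

def recv_handle(sock):
    """Importer side: receive a descriptor that is valid in *this* process.

    The returned number is what would be given to
    cuMemImportFromShareableHandle; it may differ from the sender's number.
    """
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    if len(fds) != 1:
        raise RuntimeError("expected exactly one file descriptor")
    return fds[0]

def demo_round_trip():
    """Send the write end of a pipe to a child process and read what it writes."""
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child ("importer"): drop every inherited copy of the pipe so that
        # only the descriptor received over the socket can reach it.
        status = 1
        try:
            parent_sock.close()
            os.close(r)
            os.close(w)
            fd = recv_handle(child_sock)
            os.write(fd, b"ok")
            os.close(fd)
            status = 0
        finally:
            os._exit(status)
    # Parent ("exporter").
    child_sock.close()
    send_handle(parent_sock, w)
    os.close(w)
    data = os.read(r, 2)
    os.waitpid(pid, 0)
    os.close(r)
    parent_sock.close()
    return data
```

In a real setup the two sides would be separate programs connected through a named Unix domain socket rather than a fork with socketpair; the send_handle/recv_handle calls stay the same.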