I want to share memory allocated with cuMemAlloc between processes on Linux. I have done this before in 64-bit applications, and now I need to do it in a 32-bit application. However, when I compile for 32-bit, cuIpcGetMemHandle always fails with error 201, CUDA_ERROR_INVALID_CONTEXT. Is this functionality not supported for 32-bit applications, or do I need to set things up differently? Here is a minimal example:
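#include <stdio.h>
#include <cuda.h>

int main(void)
{
    cuInit(0);
    CUdevice device = 0;
    cuDeviceGet(&device, 0);
    int flags = 0;
    CUcontext ctx;
    cuCtxCreate(&ctx, flags, device);
    int width = 100;
    int height = 100;
    CUdeviceptr devPtr = 0;
    /* 4 bytes per pixel */
    CUresult result = cuMemAlloc(&devPtr, (size_t)width * height * 4);
    printf("cuMemAlloc result = %i\n", result);
    CUipcMemHandle mem_handle = {0};
    /* This is the call that returns 201 in a 32-bit build */
    result = cuIpcGetMemHandle(&mem_handle, devPtr);
    printf("cuIpcGetMemHandle result = %i\n", result);
    /* cuIpcCloseMemHandle is only for pointers returned by cuIpcOpenMemHandle
       in an importing process, so it is not called on devPtr here */
    cuMemFree(devPtr);
    cuCtxDestroy(ctx);
    return 0;
}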
From the CUDA documentation:
The IPC API is only supported for 64-bit processes on Linux and for devices of compute capability 2.0 and higher.
CUDA IPC is not supported for a 32-bit development target or in a 32-bit process (or, for that matter, on a 32-bit OS).
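Since the restriction is on the bitness of the process, one way to fail early is to check the pointer width before touching the IPC API at all. The following is only a minimal sketch in plain C; the helper names process_is_64bit, ipc_advice and report_ipc_support are made up for illustration and are not part of CUDA.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 when pointers in this process are at
   least 64 bits wide, 0 otherwise (e.g. when built with -m32). */
static int process_is_64bit(void)
{
    return sizeof(void *) * CHAR_BIT >= 64;
}

/* Hypothetical helper: turns the bitness into a human-readable verdict.
   The wording follows the documentation quoted above. */
static const char *ipc_advice(int is_64bit)
{
    return is_64bit ? "64-bit process: CUDA IPC may be attempted"
                    : "32-bit process: CUDA IPC is not supported";
}

/* Prints the verdict; call this before cuIpcGetMemHandle. */
static void report_ipc_support(void)
{
    int is64 = process_is_64bit();
    /* Sanity check: on Linux targets CHAR_BIT is 8, so this must agree. */
    assert(is64 == (sizeof(void *) >= 8));
    printf("%s\n", ipc_advice(is64));
}
```

Calling report_ipc_support() at startup, and skipping cuIpcGetMemHandle when process_is_64bit() returns 0, avoids chasing a misleading CUDA_ERROR_INVALID_CONTEXT in a 32-bit build.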
Are the cuMemExportToShareableHandle/cuMemImportFromShareableHandle functions supported in 32-bit processes?
My guess is no. The handle functions you refer to are effectively a subset of the Virtual Memory Management API, and that API requires UVA (Unified Virtual Addressing) support:
Note that the suite of APIs described in this section require a system that supports UVA.
And UVA is mostly associated with 64-bit processes.
I realize that is probably not a conclusive answer. If you like, both mechanisms (UVA and the Virtual Memory Management API) have queryable device attributes indicating support, which can be read with cuDeviceGetAttribute (CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING and CU_DEVICE_ATTRIBUTE_VIRTUAL_MEMORY_MANAGEMENT_SUPPORTED). Those should be a good indicator, and at least for the Virtual Memory Management API, best practice is to always check those before using the API.
More generally, I would advise you that 32-bit development targets in CUDA have been deprecated for a long time, and 32-bit OS support has been removed in recent versions of CUDA.