Registering POSIX-CPU shared memory to CUDA with cudaHostRegister

Hardware: I am using a Jetson AGX Orin.

I am trying to set up IPC between multiple processes to stream video. I use a classic shm_open/mmap shared-memory mechanism, then try to map it into CUDA as pinned (page-locked) memory.

void SharedMemory::registerCudaMemory() {
  if (!checkCudaZeroCopySupport()) {
    std::cerr << "CUDA device does not support zero-copy, using POSIX shared "
                 "memory instead\n";
    return;
  }

  // Pin the existing shm mapping and make it addressable by the GPU.
  cudaError_t err = cudaHostRegister(shm_ptr, size, cudaHostRegisterMapped);
  if (err != cudaSuccess) {
    std::cerr << "Failed to register host memory with CUDA: "
              << cudaGetErrorString(err) << std::endl;
    return;  // nothing was registered, so there is no device pointer to query
  }

  err = cudaHostGetDevicePointer(&cuda_ptr, shm_ptr, 0);
  if (err != cudaSuccess) {
    std::cerr << "Failed to get CUDA device pointer for registered memory: "
              << cudaGetErrorString(err) << std::endl;
    cudaHostUnregister(shm_ptr);  // shm_ptr stays valid for plain CPU access
    cuda_ptr = nullptr;
    return;
  }
  std::cout << "CUDA device pointer for registered memory: " << cuda_ptr
            << std::endl;
  use_cuda = true;

  // Check if the pointers are the same for zero-copy
  if (shm_ptr == cuda_ptr) {
    std::cout
        << "Zero-copy memory confirmed. shm_ptr and cuda_ptr are the same.\n";
  } else {
    std::cerr << "Warning: Zero-copy memory expected but shm_ptr and cuda_ptr "
                 "are different.\n";
    std::cerr << "shm_ptr: " << shm_ptr << " cuda_ptr: " << cuda_ptr << "\n";
    cudaHostUnregister(shm_ptr);
    // The device pointer is no longer usable once the memory is unregistered.
    cuda_ptr = nullptr;
    use_cuda = false;
  }
}

The problem is that I always get a device address that differs from the CPU address. I am also having trouble reading the memory from another process through that CUDA address. I tried sharing the pointer through an HTTP web server, but I keep getting an error from cudaMemcpy. I believe I might not be using the best of methods.
Looking forward to any help.
Best regards,
Karim

I tried sharing the pointer through an HTTP web server, but I keep getting an error from cudaMemcpy.

That’s expected. Sharing numerical pointer values between separate processes is not a sensible way to share data. That statement is not unique or specific to CUDA. Generally speaking, processes on Windows and Linux have unique and un-harmonized memory maps, so an address that is valid in one process means nothing in another.
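To illustrate the point on the CPU side only, here is a minimal sketch in which two processes share data by agreeing on the name of a POSIX shared-memory object and each calling mmap for itself, rather than passing a pointer value around. The helper names map_named_shm and share_by_name, the 4096-byte size, and the use of fork to create the second process are assumptions made for this example; they are not taken from the code in the question. No CUDA call is involved, and the sketch has not been tried on a Jetson.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: map a named POSIX shared-memory object into the
// calling process. Returns nullptr on failure.
void* map_named_shm(const char* name, std::size_t size, bool create) {
  assert(name != nullptr && name[0] == '/');  // portable shm names start with '/'
  int fd = shm_open(name, create ? (O_CREAT | O_RDWR) : O_RDWR, 0600);
  if (fd < 0) return nullptr;
  if (create && ftruncate(fd, static_cast<off_t>(size)) != 0) {
    close(fd);
    return nullptr;
  }
  void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  close(fd);  // the mapping remains valid after the descriptor is closed
  return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical demo: the writer creates the object and stores a message;
// a second process attaches by name through its own mmap call and checks
// the contents. Returns true if the second process read the same message.
bool share_by_name(const char* name, const char* msg) {
  const std::size_t size = 4096;  // assumed buffer size for the example
  char* writer_view = static_cast<char*>(map_named_shm(name, size, true));
  if (writer_view == nullptr) return false;
  std::strncpy(writer_view, msg, size - 1);
  writer_view[size - 1] = '\0';

  pid_t pid = fork();
  if (pid < 0) {
    munmap(writer_view, size);
    shm_unlink(name);
    return false;
  }
  if (pid == 0) {
    // Reader process: only the name is used, never the writer's pointer.
    // The address returned here may differ from writer_view.
    char* reader_view = static_cast<char*>(map_named_shm(name, size, false));
    bool ok = reader_view != nullptr && std::strcmp(reader_view, msg) == 0;
    _exit(ok ? 0 : 1);
  }

  int status = 0;
  waitpid(pid, &status, 0);
  munmap(writer_view, size);
  shm_unlink(name);
  return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling share_by_name("/demo_shm", "frame-0") should return true when shared memory is available. On older glibc versions the program has to be linked with -lrt for shm_open. What crosses the process boundary is the object name (or offsets into the region), never an address. Presumably each process would then register its own mapping with cudaHostRegister and query its own device pointer, but that step is not shown here.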

I believe I might not be using the best of methods.

The Jetson forum participants have general recommendations for folks who want to do IPC on Jetson. You might ask or check there. Here is one example.

This may also be of interest. As you can see there, I don’t really have anything constructive to offer on the idea(s).

I was wondering whether this could be used to solve my problem: GPU Inter-Process Communications(IPC) question - #4 by Robert_Crovella

Or is CUDA zero-copy IPC impossible on Jetson?

The CUDA IPC API (e.g. cudaIpc...()) is not supported on Jetson. It seems like this was mentioned to you here.

I understand, sorry for the cross-post.
