GPU Direct RDMA - nvidia_p2p_get_pages returns -EINVAL

Hello,

I am modifying the Linux driver of an acquisition board so that the board transfers its data directly to my GTX 1080. For that, I use the GPU Direct RDMA technology described in the CUDA Toolkit v8.0 documentation.

Unfortunately, when the driver calls nvidia_p2p_get_pages to retrieve the physical addresses of the buffer the application allocated in the GTX 1080 memory, the function returns -EINVAL (-22).

I really don’t understand why this function returns -EINVAL.

  • The first two parameters (p2p_token and va_space_token) are 0. I don’t want to use the deprecated tokens, and the same process both allocates the buffer and calls into the driver that calls nvidia_p2p_get_pages.
  • virtual_address parameter is the virtual address the application received from cudaMalloc. This address is 64 KiB aligned.
  • length parameter is the size of the allocated buffer (2 * 1024 * 1024 bytes)
  • page_table parameter is the address of a (nvidia_p2p_page_table *) variable, initially NULL as in the sample I found.
  • free_callback parameter is the address of a function
  • data parameter is the address of my context structure
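
The alignment and size statements in the list above can be double-checked with a small user-space sketch before the values ever reach the driver. It is only a sketch: the GPU_BOUND_* macro names are modeled on the pinning example in NVIDIA's GPUDirect RDMA documentation, while struct pin_range, is_gpu_aligned and compute_pin_range are hypothetical names introduced here for illustration. Passing this check only rules out misalignment; it does not rule out other causes of -EINVAL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 64 KiB GPU page granularity (assumption: same value as in NVIDIA's example). */
#define GPU_BOUND_SHIFT  16
#define GPU_BOUND_SIZE   ((uint64_t)1 << GPU_BOUND_SHIFT)
#define GPU_BOUND_OFFSET (GPU_BOUND_SIZE - 1)
#define GPU_BOUND_MASK   (~GPU_BOUND_OFFSET)

/* Hypothetical container for the range a driver would ask to pin. */
struct pin_range {
    uint64_t virt_start; /* address rounded down to a 64 KiB boundary */
    uint64_t pin_size;   /* bytes from virt_start to the end of the buffer */
};

/* True when value is a multiple of 64 KiB. */
static bool is_gpu_aligned(uint64_t value)
{
    return (value & GPU_BOUND_OFFSET) == 0;
}

/* Round the start address down to a 64 KiB boundary and extend the
 * length so the range still covers the whole buffer. */
static struct pin_range compute_pin_range(uint64_t address, uint64_t size)
{
    struct pin_range r;

    r.virt_start = address & GPU_BOUND_MASK;
    r.pin_size = address + size - r.virt_start;
    assert(is_gpu_aligned(r.virt_start));
    return r;
}
```

For the buffer described above (an address that is already 64 KiB aligned and a length of 2 * 1024 * 1024 bytes), compute_pin_range returns the address and length unchanged, and is_gpu_aligned is true for both.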

Do I have to configure the NVIDIA driver in some “mode” in order to use the GPU Direct RDMA technology?

Do you know of anything that could make nvidia_p2p_get_pages fail and return -EINVAL?

Any hint will be greatly appreciated.

Thanks in advance,
Martin

The CUDA driver is pretty picky when it comes to p2p configurations. Are you sure your motherboard is suitable for p2p applications? I’m not very familiar with the nuances of this stuff so I moved this to the CUDA forum. Hopefully someone with more CUDA-specific expertise can help.

Thanks,

I already posted the same question on the CUDA forum, and I learned there that I need a Tesla or Quadro board to get GPU Direct RDMA working.

Martin