GPUDirect RDMA - Multithreading


I am writing a driver that uses GPUDirect RDMA on the Jetson for direct access to CUDA memory. I have a number of distinct data streams, which are transferred via DMA. The different streams are managed by different threads in userland, and I control the transfers through the driver's file operations.
Can I call the interface (see below) from different threads without additional precautions, such as mutexes or spinlocks? Or do I need some sort of synchronization? Due to the architecture, the calls for any one data region/stream are always sequential. But I could, for example, have two parallel calls to nvidia_p2p_dma_map_pages for different memory regions.

I am using the following functions:

  • nvidia_p2p_free_page_table
  • nvidia_p2p_put_pages
  • nvidia_p2p_get_pages
  • nvidia_p2p_dma_map_pages
  • nvidia_p2p_dma_unmap_pages

Best regards and thanks in advance,

Our team is investigating this now and will post an update soon.


I checked with the internal team, and they said it should be OK to call these functions in parallel for different memory regions.
For additional information, look at the sample gdrdrv.c in the NVIDIA/gdrcopy repository on GitHub (master branch).
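
To illustrate the pattern described above (calls for one region stay sequential, calls for different regions run in parallel), here is a minimal userspace sketch using pthreads. It does not call the nvidia_p2p_* API at all: `struct region`, `region_map`, `region_unmap`, and `run_streams` are hypothetical names, and the `pinned`/`mapped` flags only stand in for the page table and DMA mapping a real driver would hold. The per-region mutex is an assumption for the case where a driver cannot guarantee sequential access per region by design; it is not something the answer above requires.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-stream state. In a real driver this would hold the
 * page table and DMA mapping obtained for one memory region. */
struct region {
    pthread_mutex_t lock;  /* serializes calls for THIS region only */
    uint64_t vaddr;
    uint64_t size;
    int pinned;            /* stand-in for "pages are pinned" */
    int mapped;            /* stand-in for "pages are DMA-mapped" */
};

void region_init(struct region *r, uint64_t vaddr, uint64_t size)
{
    pthread_mutex_init(&r->lock, NULL);
    r->vaddr = vaddr;
    r->size = size;
    r->pinned = 0;
    r->mapped = 0;
}

/* Stand-in for the pin + map step. Returns 0 on success, -1 if the
 * region is already mapped. */
int region_map(struct region *r)
{
    int ret = 0;
    pthread_mutex_lock(&r->lock);
    if (r->mapped) {
        ret = -1;
    } else {
        r->pinned = 1;
        r->mapped = 1;
    }
    pthread_mutex_unlock(&r->lock);
    return ret;
}

/* Stand-in for the unmap + release step. Returns 0 on success, -1 if
 * the region is not currently mapped. */
int region_unmap(struct region *r)
{
    int ret = 0;
    pthread_mutex_lock(&r->lock);
    if (!r->mapped) {
        ret = -1;
    } else {
        r->mapped = 0;
        r->pinned = 0;
    }
    pthread_mutex_unlock(&r->lock);
    return ret;
}

/* One thread per stream; each thread touches only its own region. */
static void *stream_thread(void *arg)
{
    struct region *r = arg;
    for (int i = 0; i < 1000; i++) {
        if (region_map(r) != 0 || region_unmap(r) != 0)
            return (void *)1;  /* non-NULL signals a failure */
    }
    return NULL;
}

/* Runs n streams (at most 8) in parallel, each on a distinct region.
 * Returns the number of streams that failed. */
int run_streams(int n)
{
    enum { MAX_STREAMS = 8 };
    struct region regions[MAX_STREAMS];
    pthread_t threads[MAX_STREAMS];
    int started[MAX_STREAMS];
    int failures = 0;

    assert(n >= 0);
    if (n > MAX_STREAMS)
        n = MAX_STREAMS;

    for (int i = 0; i < n; i++) {
        region_init(&regions[i], 0x10000u * (uint64_t)(i + 1), 0x10000u);
        started[i] = pthread_create(&threads[i], NULL,
                                    stream_thread, &regions[i]) == 0;
        if (!started[i])
            failures++;
    }
    for (int i = 0; i < n; i++) {
        void *result = NULL;
        if (started[i]) {
            pthread_join(threads[i], &result);
            if (result != NULL)
                failures++;
        }
        pthread_mutex_destroy(&regions[i].lock);
    }
    return failures;
}
```

Calling `run_streams(4)` from a `main` function (built with `-pthread`) starts four threads that map and unmap their own regions concurrently. No lock is shared between regions, so the streams never wait on each other.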

Thank you very much for the clarification and link to the corresponding source file.