GPUDirect RDMA - use Jetson's DMA

Hi,
In the GPUDirect RDMA Github demo:

You use the FPGA’s DMA engine to copy the data from the GPU’s memory space into the FPGA memory.
Is it possible to use the Jetson’s DMA engine instead, so that it copies the data into the FPGA memory?

Thanks

Hi,

RDMA is about direct access; no memory copy is involved.

The goal of GPUDirect RDMA is to share a cudaMalloc buffer with the FPGA.
It cannot be used in the opposite direction.

Thanks.

@AastaLLL
Thank you for your reply.

Just to make sure, what do you mean by “counterpart”?

Thanks

Hi,

The FPGA can access CUDA memory.
But CUDA cannot access the FPGA’s memory.

The latter would require importing/wrapping the CPU memory into the GPU’s address space,
which is not supported.
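
For illustration, this kind of import would be attempted with cudaHostRegister (a real CUDA runtime API); the sketch below shows the general shape, but as noted above, this path is not supported for mapping the FPGA’s BAR memory into CUDA space on Jetson, so the call is expected to fail there:

```c
#include <cuda_runtime.h>
#include <stdio.h>

/* Hedged sketch: try to wrap an existing CPU/IO mapping so the GPU can
 * access it. On Jetson this import path is NOT supported for device
 * (FPGA BAR) memory, which is exactly the limitation described above. */
int try_import(void *cpu_ptr, size_t len)
{
    /* cudaHostRegisterIoMemory is intended for memory-mapped I/O regions */
    cudaError_t err = cudaHostRegister(cpu_ptr, len, cudaHostRegisterIoMemory);
    if (err != cudaSuccess) {
        fprintf(stderr, "import failed: %s\n", cudaGetErrorString(err));
        return -1;
    }
    void *dev_ptr = NULL;
    err = cudaHostGetDevicePointer(&dev_ptr, cpu_ptr, 0);
    /* dev_ptr would be usable from CUDA kernels, if this were supported */
    return (err == cudaSuccess) ? 0 : -1;
}
```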

Thanks.

Thank you for your reply @AastaLLL
There is something small I didn’t understand:

In the demo the FPGA can access CUDA memory.

“Someone” has to program the FPGA’s DMA engine with appropriate data:

  • physical source address, corresponding to the CUDA virtual address space
  • physical destination address, in the FPGA
  • length
  1. Who does this? The CUDA code?
  2. Why can’t it program the Jetson’s DMA to do the same work instead of the FPGA’s DMA?

Thanks for the help

Hi,

Could you share the demo you mentioned with us?
That way we can better understand your questions.

Thanks.

Sure:

Hi @AastaLLL ,
Any news regarding my question?

Now that I’ve sent you the demo link, are you able to answer my questions?

Hi,

Thanks for your patience.

1. No, it is an ioctl call rather than CUDA code:
For example: https://github.com/NVIDIA/jetson-rdma-picoevb/blob/master/client-applications/rdma-cuda-c2h-perf.cu#L105
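
Roughly, the flow in that file is: allocate with cudaMalloc, hand the pointer to the demo’s kernel driver via ioctl, and the driver pins the memory and programs the FPGA’s DMA engine. Below is a simplified sketch of the user-space side; the struct, ioctl, and device names are paraphrased placeholders — the exact definitions live in picoevb-rdma-ioctl.h in that repo:

```c
#include <sys/ioctl.h>
#include <fcntl.h>
#include <stdint.h>
#include <cuda_runtime.h>

/* Paraphrased placeholders; see picoevb-rdma-ioctl.h for the real names */
struct rdma_c2h_dma {
    uint64_t src;   /* source offset in FPGA memory */
    uint64_t dst;   /* CUDA device pointer, pinned by the driver */
    uint64_t len;   /* transfer length in bytes */
};
#define RDMA_IOC_C2H_DMA _IOW('P', 1, struct rdma_c2h_dma) /* placeholder */

int run_c2h(size_t len)
{
    int fd = open("/dev/picoevb", O_RDWR);    /* character device exposed by the demo driver */
    void *gpu_buf;
    cudaMalloc(&gpu_buf, len);                /* ordinary CUDA allocation */

    struct rdma_c2h_dma dma = {
        .src = 0,
        .dst = (uint64_t)(uintptr_t)gpu_buf,  /* the driver pins this GPU buffer */
        .len = len,
    };
    /* The ioctl (kernel driver), not CUDA code, programs the FPGA's
     * DMA engine with the physical addresses and waits for completion. */
    return ioctl(fd, RDMA_IOC_C2H_DMA, &dma);
}
```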

2. The sample provides an interface for the FPGA to access CUDA memory.
If you want to access FPGA memory on Jetson, please check with the FPGA provider to see if they have a similar API/mechanism.

Thanks.

Hi @AastaLLL ,
Thank you for your reply.
I apologize, but I didn’t understand your answer.

You said that the code being executed on the GPU uses an ioctl to program the FPGA’s DMA core for executing the transfer from the Jetson’s memory (GPU address space) into the FPGA’s memory.

My question was whether it’s possible to use another method, maybe another ioctl, to program the Jetson’s DMA core to execute the transfer from the Jetson’s memory (GPU address space) into the FPGA’s memory.

In other words, instead of having the FPGA’s DMA core do the transfers, we want the Jetson’s DMA core to do them.

Hi,

CUDA doesn’t support importing pre-allocated memory into the CUDA memory space.
But if you don’t need the GPU to access this memory, the mapping should work the standard Linux DMA way.
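
For context, the “standard Linux DMA way” usually means the kernel’s dmaengine client API: a driver requests a channel, prepares a memcpy descriptor between two DMA addresses, and kicks off the transfer. A hedged kernel-side sketch (the dmaengine calls are real, but the channel name and address sources are placeholders):

```c
#include <linux/dmaengine.h>
#include <linux/dma-mapping.h>

/* Sketch of a memcpy transfer via the generic Linux dmaengine API.
 * src/dst would be DMA addresses obtained by mapping the system RAM
 * buffer and the FPGA BAR region, respectively. */
static int jetson_dma_copy(struct device *dev, dma_addr_t dst,
                           dma_addr_t src, size_t len)
{
    struct dma_chan *chan;
    struct dma_async_tx_descriptor *tx;
    dma_cookie_t cookie;

    chan = dma_request_chan(dev, "rx");   /* channel name is a placeholder */
    if (IS_ERR(chan))
        return PTR_ERR(chan);

    tx = dmaengine_prep_dma_memcpy(chan, dst, src, len, DMA_PREP_INTERRUPT);
    if (!tx) {
        dma_release_channel(chan);
        return -EIO;
    }

    cookie = dmaengine_submit(tx);
    dma_async_issue_pending(chan);        /* start the transfer */
    dma_sync_wait(chan, cookie);          /* busy-wait for completion (a callback is also possible) */

    dma_release_channel(chan);
    return 0;
}
```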

Thanks.

I apologize,
I don’t understand your answer.
Maybe I’m not familiar enough with the details.
I’ll appreciate if you could elaborate a little.

In the demo we see the ioctl that utilizes the FPGA’s DMA core to perform the transfer. The code configures the FPGA’s DMA core with the suitable physical addresses.

Why can’t it do the same with the Jetson’s DMA core?
That is, use another ioctl that loads the same physical data addresses into the Jetson’s DMA core instead of the FPGA’s, and initiates the transfer with the Jetson’s DMA.

Hi,

Do you mean PCIe DMA?
If yes, it can work in the standard Linux way.

If not, could you tell us more about the “Jetson’s DMA” you indicated?

Thanks.

Hi,

Not sure if this helps, but please check the topic below, which discusses the DMA engine.

Thanks.

Yes, I mean the general DMA engine in the Jetson (not a specific device’s DMA engine).

What I meant was: instead of activating the FPGA’s DMA from the CUDA code, could we just activate the PCIe DMA from the CUDA code on the GPU?

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.