GPUDirect RDMA - Module cannot be inserted into kernel

Hi,

Sorry for missing this.

Currently, nvidia.ko and nv-p2p.ko cannot be loaded together because of a symbol conflict.
Without nvidia.ko, the graphics driver cannot be loaded, so the display and any features that require the graphics driver won’t work.

But a pure CUDA application should not be affected.

We are discussing a possible fix with our internal team.
We will share more information once we have feedback.

The answers to your questions:

  1. Yes
  2. nv-p2p.ko only
  3. Unfortunately no.

Thanks.

Thanks for clarifying. I hope you can find a solution soon.

Has there been any progress with this issue?
I have run into the same scenario as the previous users.
While being able to use the PCI-e RDMA functionality is great, not being able to simultaneously display anything is problematic.

Any updates would be appreciated!

Hi,

We have passed this request to our internal team.

Since it requires modifications to both nvidia.ko and nv-p2p.ko, it won’t be a quick fix.
We will share more information with you when we have it.

Currently, the workaround is to leave the nvidia.ko module unloaded.
Thanks.

So I possibly have a “quick” workaround that may work. I will need to test further.

On the Jetson Orin, I downloaded public_sources.tbz2. I then modified the source for nv-p2p.c & nv-p2p.h located here:

Linux_for_Tegra/source/public/nvidia_kernel_display_driver_source/kernel-open/

I changed all the exported function names.
For example from this:

nvidia_p2p_dma_unmap_pages
nvidia_p2p_get_pages
nvidia_p2p_put_pages
nvidia_p2p_dma_map_pages
nvidia_p2p_free_page_table

To this:

nvidia_p2p_dma_unmap_pages_old
nvidia_p2p_get_pages_old
nvidia_p2p_put_pages_old
nvidia_p2p_dma_map_pages_old
nvidia_p2p_free_page_table_old
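For anyone repeating this, the rename can probably be scripted rather than done by hand. The following is only a sketch: it assumes GNU sed, covers just the five names listed above (the post says all exported names were changed, so there may be more), and `rename_p2p_symbols` is a made-up helper name. The file paths in the usage comment are placeholders.

```shell
#!/bin/sh
# Append "_old" to the five nvidia_p2p_* names listed above, in place.
# Assumes GNU sed (-i, -E and \b are used). Whole identifiers only, so a
# name that already ends in _old is left untouched on a second run.
rename_p2p_symbols() {
    for f in "$@"; do
        sed -i -E \
            's/\b(nvidia_p2p_(dma_unmap_pages|get_pages|put_pages|dma_map_pages|free_page_table))\b/\1_old/g' \
            "$f" || return 1
    done
}

# Usage (placeholder paths, adjust to where the files sit under kernel-open/):
# rename_p2p_symbols path/to/nv-p2p.c path/to/nv-p2p.h
```

Matching on whole identifiers keeps the edit idempotent and avoids touching longer names that merely start with one of these prefixes.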

I then rebuilt nvidia.ko using make in

Linux_for_Tegra/source/public/nvidia_kernel_display_driver_source/

After a successful build, I replaced the original nvidia.ko located at

/lib/modules/5.10.104-tegra/extra/opensrc-disp/nvidia.ko

With my newly built nvidia.ko module.
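Since this overwrites the stock driver, it may be worth keeping a copy of the original first. A minimal sketch, where `replace_module_with_backup` and the `.orig` suffix are my own choices and not part of the steps above:

```shell
#!/bin/sh
# Replace an installed kernel module, keeping a one-time backup of the original.
# Usage: replace_module_with_backup NEW_KO INSTALLED_KO
# (run as root when INSTALLED_KO is under /lib/modules)
replace_module_with_backup() {
    new=$1
    dest=$2
    [ -f "$new" ] || { echo "missing: $new" >&2; return 1; }
    # Back up only once, so re-running never overwrites the stock copy.
    [ -e "$dest.orig" ] || cp -p "$dest" "$dest.orig" || return 1
    cp "$new" "$dest"
}

# Usage with the path from this post (the source path is a placeholder):
# replace_module_with_backup path/to/nvidia.ko \
#     /lib/modules/5.10.104-tegra/extra/opensrc-disp/nvidia.ko
```

Restoring is then a matter of copying the `.orig` file back over the module.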

Afterwards, I loaded the nv-p2p module:

$ sudo insmod /lib/modules/5.10.104-tegra/kernel/drivers/nv-p2p/nvidia-p2p.ko

Finally, I was able to load the picoevb-rdma module:

$ sudo insmod picoevb-rdma.ko
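One way to sanity-check the result, assuming the failure really was duplicate nvidia_p2p_* exports, is to list which loaded module owns each of those symbols. The sketch below reads /proc/kallsyms (or any file in the same format); `p2p_symbol_owners` is a hypothetical helper name.

```shell
#!/bin/sh
# Print "symbol module" for every nvidia_p2p_* symbol that belongs to a
# loadable module, from a kallsyms-style listing (default: /proc/kallsyms).
# Lines look like: "<address> <type> <symbol>\t[<module>]".
p2p_symbol_owners() {
    awk '$3 ~ /^nvidia_p2p_/ && NF >= 4 {
             gsub(/\[|\]/, "", $4)   # strip the brackets around the module name
             print $3, $4
         }' "${1:-/proc/kallsyms}" | LC_ALL=C sort -u
}

# Usage on the running system:
# p2p_symbol_owners
```

If the rename took effect, each symbol name should show up under a single module, with the `_old` variants under the rebuilt display driver.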

Hopefully this helps.

Regards,
YK

Hi,

Thanks for sharing this.

We are double-checking with our internal team whether this works as expected.
We will let you know what we hear, and thanks again for sharing.

Hi,

The workaround should work.
Thanks for sharing this.
