According to Nvidia’s GPUDirect RDMA page (http://docs.nvidia.com/cuda/gpudirect-rdma/index.html), nvidia_p2p_get_pages() requires a pointer to a callback function to “be invoked if the pages underlying the virtual address range are freed implicitly. Cannot be NULL.” I don’t really understand what this means and am hoping someone can provide a better explanation. The example given for this free_callback() is:
void free_callback(void *data)
{
    my_state *state = data;
    wait_for_pending_transfers(state);
    nvidia_p2p_free_pages(state->page_table);
}
I am new to kernel programming and am wondering what actual function could be used in place of wait_for_pending_transfers(state). I believe this call is crashing my kernel when my device driver grabs GPU memory, as can be seen in this segment of a vmcore-dmesg.txt crash report:
nvidia 0000:82:00.0: irq 135 for MSI/MSI-X
ioctl_dev: mapping GPU page…
ioctl_dev: BOOM! pinned 65536 bytes of GPU memory…
releasing GPU page…
BUG: unable to handle kernel NULL pointer dereference at 0000000000000034
IP: [] nvidia_p2p_free_page_table+0x39/0x70 [nvidia]
PGD 0
Oops: 0000 [#1] SMP
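To make the question concrete, here is a userspace mock-up (pthreads, not kernel code) of what I imagine wait_for_pending_transfers() is supposed to do: block until no transfer is still using the pinned pages, then free the page table exactly once. Everything in it is my own assumption: struct my_state and its fields, transfer_begin(), transfer_end() and release_pages() are hypothetical names, and free() stands in for the Nvidia free call. In a real driver I assume the waiting would be done with a kernel primitive such as a completion or wait queue, and I do not know whether the Nvidia callback is allowed to sleep.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's per-mapping state. */
struct my_state {
    pthread_mutex_t lock;
    pthread_cond_t  idle;       /* signalled when in_flight reaches 0 */
    int             in_flight;  /* transfers still using the pages */
    void           *page_table; /* stand-in for the Nvidia page table */
};

static void state_init(struct my_state *st, void *page_table)
{
    pthread_mutex_init(&st->lock, NULL);
    pthread_cond_init(&st->idle, NULL);
    st->in_flight = 0;
    st->page_table = page_table;
}

/* Start a transfer; returns 0 if the pages have already been freed. */
static int transfer_begin(struct my_state *st)
{
    int ok;

    pthread_mutex_lock(&st->lock);
    ok = (st->page_table != NULL);
    if (ok)
        st->in_flight++;
    pthread_mutex_unlock(&st->lock);
    return ok;
}

/* Finish a transfer and wake anyone waiting for the mapping to go idle. */
static void transfer_end(struct my_state *st)
{
    pthread_mutex_lock(&st->lock);
    assert(st->in_flight > 0);
    if (--st->in_flight == 0)
        pthread_cond_broadcast(&st->idle);
    pthread_mutex_unlock(&st->lock);
}

/* Caller must hold st->lock. Blocks until no transfer is in flight. */
static void wait_for_pending_transfers(struct my_state *st)
{
    while (st->in_flight > 0)
        pthread_cond_wait(&st->idle, &st->lock);
}

/* Mock of the callback for the "pages freed implicitly" case. */
static void free_callback(void *data)
{
    struct my_state *st = data;

    pthread_mutex_lock(&st->lock);
    wait_for_pending_transfers(st);
    free(st->page_table);       /* stand-in for the Nvidia free call */
    st->page_table = NULL;      /* lets later paths see it is gone */
    pthread_mutex_unlock(&st->lock);
}

/* Mock of the explicit release path; returns 1 if it freed the pages. */
static int release_pages(struct my_state *st)
{
    int freed = 0;

    pthread_mutex_lock(&st->lock);
    if (st->page_table != NULL) {
        wait_for_pending_transfers(st);
        free(st->page_table);
        st->page_table = NULL;
        freed = 1;
    }
    pthread_mutex_unlock(&st->lock);
    return freed;
}
```

The part I am least sure about is the NULL check in release_pages(): in this mock it stops the explicit release path from touching a page table that the callback has already freed, and I wonder whether a missing check like that is related to the NULL pointer dereference above.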
Thanks for your help. If you think a glimpse of my kernel code would be useful, I can post that too.