How to memcpy in linux driver directly to GPU?


I have a third-party device that is filling a Linux kernel buffer with data. I would like to (in the driver) copy that data directly to the GPU.

I think this is sort of like gdrcopy, except that I want to do the memcpy in the driver.

My first attempt was to:

  1. User Space: Allocate GPU memory using cudaMalloc and set the CU_POINTER_ATTRIBUTE_SYNC_MEMOPS attribute
  2. Linux Kernel Space: Pin that pointer memory using nvidia_p2p_get_pages
  3. Linux Kernel Space: Use the page physical addresses provided from step 2 to memcpy(gpu_pinned_page, kernel_cpu_pinned_page, size)
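
Step 1 above can be sketched in user space roughly as follows (a hedged sketch using the CUDA driver API; error handling is omitted, and the function name `alloc_pinnable` is illustrative, not from the thread):

```c
/* User-space sketch: allocate GPU memory and flag it for synchronous
   memory operations so it can safely be pinned for third-party DMA. */
#include <cuda.h>
#include <cuda_runtime.h>

CUdeviceptr alloc_pinnable(size_t size)
{
    void *dptr = NULL;
    cudaMalloc(&dptr, size);                 /* step 1a: allocate GPU memory */

    unsigned int flag = 1;                   /* step 1b: enforce synchronous memops */
    cuPointerSetAttribute(&flag, CU_POINTER_ATTRIBUTE_SYNC_MEMOPS,
                          (CUdeviceptr)dptr);

    return (CUdeviceptr)dptr;                /* hand this address to the driver */
}
```

The returned device address is what the driver would then pass to nvidia_p2p_get_pages in step 2.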

This memcpy fails spectacularly (the entire server requires a hard power cycle).

What step did I miss? I can add the mmap from gdrcopy, but is that only for user space? Or are there pointers created in the mmap call that I can also use to do the memcpy in the driver?

Hi Brandy,

You can reference this doc:

Hello, thank you. Yes, of course, I have referenced that document, and it works great from user space. But I wanted to keep the interactions in the driver. So what I needed to do was memremap the NVIDIA page-table physical addresses to kernel virtual addresses. Then I could keep all the interactions in the kernel.

Here is the snippet of code:

struct gpumem_t {
    void *handle;
    u64 dptr_start;   /* device pointer from cudaMalloc in user space */
    u64 dptr_end;
    u64 nv_page_size;
    nvidia_p2p_page_table_t *page_table;
    uint64_t *kernel_mappings;
};

struct gpumem_t *gcb = NULL;

int i;

error = nvidia_p2p_get_pages(0, 0, gcb->dptr_start, pin_size, &gcb->page_table,
                             free_nvp_callback, gcb);
if (error != 0) {
    pr_err("Error in nvidia_p2p_get_pages()\n");
    error = -EINVAL;
    goto do_free_mem;
}

gcb->kernel_mappings = kzalloc(gcb->page_table->entries * sizeof(uint64_t), GFP_KERNEL);
if (!gcb->kernel_mappings) {
    error = -ENOMEM;
    goto do_free_mem;
}

pr_info("nvidia num pages pinned: %d\n", gcb->page_table->entries);
for (i = 0; i < gcb->page_table->entries; ++i) {
    void *vaddr = memremap((u64)gcb->page_table->pages[i]->physical_address,
                           gcb->nv_page_size, MEMREMAP_WC);
    if (vaddr) {
        gcb->kernel_mappings[i] = (uint64_t)vaddr;
    } else {
        pr_err("Cannot map this page to kernel space 0x%016llx\n",
               gcb->page_table->pages[i]->physical_address);
        gcb->kernel_mappings[i] = 0;
    }
    pr_info("page[%d]=0x%016llx kern_map=0x%016llx%s\n", i,
            gcb->page_table->pages[i]->physical_address,
            gcb->kernel_mappings[i], (i > 200) ? " and counting." : ".");
}

Hi Brandy,

For such a detailed requirement, I suggest you contact NVIDIA support.

Thank you
Meng, Shi
