I have one Xavier AGX (the endpoint) connected to an x86 host (the root complex).
I use the “pci-epf-nv-test” driver on the endpoint and the “tegra_ep_mem” driver on my x86 host (for DMA read/write operations to/from the Xavier EP shared memory).
I modified “pci-epf-nv-test” to increase the size of the shared memory (as described in this topic).
How can I access the pointer returned by “dma_alloc_coherent” directly in a CUDA kernel?
CUDA kernels can only access memory that was allocated or registered through CUDA APIs. In this case, you could use cudaHostRegister (with the cudaHostRegisterIoMemory flag); see this post as a reference on how to register external memory with CUDA.
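To make the cudaHostRegister suggestion more concrete, here is a minimal sketch of how the registration could look from user space. It assumes the driver exposes the dma_alloc_coherent buffer through an mmap-able device node, which the stock driver may not do; the device path passed to `map_and_fill` and the helper names `round_up_to_page`, `register_and_fill` and `map_and_fill` are hypothetical and only for illustration. Whether cudaHostRegisterIoMemory is accepted for such a mapping on a given platform is also something you would have to verify.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Round a byte count up to a whole number of pages (mmap works in pages).
static size_t round_up_to_page(size_t bytes, size_t page) {
    return (bytes + page - 1) / page * page;
}

// Trivial kernel: write a 32-bit pattern into the registered buffer.
__global__ void fill(uint32_t* buf, size_t n, uint32_t value) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) buf[i] = value;
}

// Register an existing CPU mapping with CUDA, run the kernel on it, unregister.
// `flags` would be cudaHostRegisterIoMemory | cudaHostRegisterMapped for an
// mmap'd I/O region, or cudaHostRegisterMapped for ordinary host memory.
static int register_and_fill(void* host_ptr, size_t bytes,
                             unsigned int flags, uint32_t value) {
    cudaError_t err = cudaHostRegister(host_ptr, bytes, flags);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaHostRegister: %s\n", cudaGetErrorString(err));
        return -1;
    }
    // Ask CUDA for the device-side address of the registered range.
    void* dev_ptr = nullptr;
    err = cudaHostGetDevicePointer(&dev_ptr, host_ptr, 0);
    if (err == cudaSuccess) {
        size_t n = bytes / sizeof(uint32_t);
        unsigned int blocks = (unsigned int)((n + 255) / 256);
        fill<<<blocks, 256>>>(static_cast<uint32_t*>(dev_ptr), n, value);
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    cudaHostUnregister(host_ptr);
    return err == cudaSuccess ? 0 : -1;
}

// Assumption: `dev_path` is a device node whose mmap handler maps the
// shared buffer. The path is hypothetical and must come from your driver.
static int map_and_fill(const char* dev_path, size_t bytes, uint32_t value) {
    size_t len = round_up_to_page(bytes, (size_t)sysconf(_SC_PAGESIZE));
    int fd = open(dev_path, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return -1; }
    void* p = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (p == MAP_FAILED) { perror("mmap"); return -1; }
    int rc = register_and_fill(p, len,
                               cudaHostRegisterIoMemory | cudaHostRegisterMapped,
                               value);
    munmap(p, len);
    return rc;
}
```

The kernel never sees the kernel-space pointer returned by dma_alloc_coherent itself; it works on the device pointer that CUDA derives from the user-space mapping, which is why the mapping has to be registered first.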
GPUDirect RDMA is primarily meant to expose CUDA memory to the outside rather than to import external memory. If my understanding of the workflow is correct, memory allocated on the Tegra Xavier (is it allocated with the CUDA API or otherwise?) needs to be accessed by the GPU on the desktop.
Assuming it is CUDA-allocated memory on the Tegra, you could map that memory on the desktop side using GPUDirect RDMA. See here for details. Once the memory is mapped on the desktop, you could use the cudaHostRegister approach mentioned above.