I am using an Altera FPGA to implement a PCIe card that exchanges data with a GTX1060.
I register the PCIe card's address space as pinned memory with cudaHostRegister(pci_data,…), so the two devices should exchange
data by DMA, with the GTX1060 as master and the PCIe card as slave.
When copying from the GPU to the PCIe card with cudaMemcpy(pci_data, dev_data,…), the PCIe card receives the correct data, and the GPU memory rate shown in Nsight is 2 GB/s.
When copying from the PCIe card to the GPU with cudaMemcpy(dev_data, pci_data,…), the GPU receives wrong data: every word is 0xffff_ffff, and the GPU memory rate shown in Nsight is 9 GB/s. I checked the FPGA signals, and there is no read or write request at all.
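
A minimal sketch of the sequence described above, for anyone trying to reproduce it. It is not the exact code from this setup. It assumes Linux, a BAR mapped into user space through a sysfs resource file, a hypothetical device path and transfer size, and the cudaHostRegisterIoMemory flag, which the CUDA runtime provides for registering memory-mapped I/O space. The arguments elided in the calls above may differ.

```cuda
#include <cstdio>
#include <cstdint>
#include <cstdlib>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    // ASSUMPTION: hypothetical BAR0 resource file of the FPGA card.
    const char *bar_path = "/sys/bus/pci/devices/0000:01:00.0/resource0";
    // ASSUMPTION: hypothetical transfer size; must not exceed the BAR size.
    const size_t size = 1 << 20;

    int fd = open(bar_path, O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }

    // Map the card's BAR into this process's address space.
    void *pci_data = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
    if (pci_data == MAP_FAILED) { perror("mmap"); close(fd); return EXIT_FAILURE; }

    // Register the mapping with CUDA as memory-mapped I/O space.
    CHECK(cudaHostRegister(pci_data, size, cudaHostRegisterIoMemory));

    void *dev_data = nullptr;
    CHECK(cudaMalloc(&dev_data, size));
    CHECK(cudaMemset(dev_data, 0xA5, size));  // known pattern to send

    // GPU -> PCIe card (the direction that works in this report).
    CHECK(cudaMemcpy(pci_data, dev_data, size, cudaMemcpyDeviceToHost));

    // PCIe card -> GPU (the direction that returns 0xffff_ffff).
    CHECK(cudaMemcpy(dev_data, pci_data, size, cudaMemcpyHostToDevice));

    // Copy one word back into ordinary host memory to inspect it.
    uint32_t first_word = 0;
    CHECK(cudaMemcpy(&first_word, dev_data, sizeof(first_word),
                     cudaMemcpyDeviceToHost));
    printf("first word read back from GPU: 0x%08x\n", first_word);

    CHECK(cudaFree(dev_data));
    CHECK(cudaHostUnregister(pci_data));
    munmap(pci_data, size);
    close(fd);
    return 0;
}
```

Checking the return value of every call matters here, because a failed cudaHostRegister() or cudaMemcpy() would otherwise go unnoticed and leave stale data in the destination.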
Why can't the GPU get the correct data from the PCIe card with cudaMemcpy() when the card's address space is registered as pinned memory? How can I solve this problem?
PS: When I use unpinned memory, by only deleting the cudaHostRegister(pci_data,…) call, data exchange between the GTX1060 and the PCIe card is correct in both directions. But that exchange doesn't use DMA, so the rate is very slow.