Device-to-host pinned memory: third-party hardware DMA transfers fail

Greetings,

Just updated from CUDA 2.0 beta 1 to the CUDA 2.0 release and am having an issue with pinned host memory.

I use CUDA to process video. The last stage of this processing is a device-to-host transfer into pinned memory. Once this transfer is complete, I use the pinned host memory as the source of a DMA transfer to a third-party video display device (AJA or Bluefish). These DMA transfers worked perfectly with CUDA 1.1 and even with the CUDA 2.0 beta, but appear to be failing with the final release of CUDA 2.0.

Any ideas what has changed in pinned host memory between the beta and the release?

Windows XP, 32 bit, Dell T7400 workstation, Quadro 5600

BTW, I also support the Deltacast video adapter, which does not support transferring from NVIDIA’s pinned host memory but provides its own buffer of pinned host memory. This (synchronous) device-to-host transfer is working fine, so the problem appears to be related to the pinned host memory supplied by CUDA.

Can you provide a small test app which reproduces the problem?

I could, but the failure doesn’t occur on the device to host transfer, it occurs on the host to third party device transfer.

So with AJA as an example, I do the following steps:

  1. Device to host pinned memory transfer
  2. Wait for transfer to complete
  3. Assign CUDA host pinned memory as source for DMA transfer to AJA device memory
  4. Engage and monitor DMA transfer via DMA controller on AJA device

Step 4 fails. It appears that something about the host memory created by CUDA has changed from beta to release version.

Do you have any of these broadcast video devices (AJA or Bluefish)? If so, then I can work up a quick app to demonstrate this.

Thanks for your quick response. We just did our first release today and are using beta 1, but would really like to resolve this so we don’t have to use beta versions.

Probably somewhat pointless for me to respond, since I guess only NVIDIA people have enough information to help, but:

What exactly does “fail” mean? Do you just get the wrong data, does the DMA controller return an error or…?

For example, I could imagine that the GPU DMA engines support a wide range of memory layouts, so maybe the host memory is no longer always aligned to 4 KB, which your DMA controller might not support.
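One quick way to test the alignment guess is to log how far the pinned pointer sits past a 4 KB boundary before handing it to the DMA call, once under the beta and once under the release. This is only a rough sketch in plain C++; `misalignment` and `report_alignment` are made-up helper names, and the 4 KB figure is an assumption about what the card's DMA controller expects, not something taken from the AJA or Bluefish documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Returns how many bytes `ptr` sits past the previous `alignment` boundary.
// 0 means the pointer is aligned. `alignment` must be a power of two.
std::size_t misalignment(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return static_cast<std::size_t>(
        reinterpret_cast<std::uintptr_t>(ptr) & (alignment - 1));
}

// Hypothetical helper: print the alignment of a buffer and report
// whether it starts exactly on an `alignment` boundary.
bool report_alignment(const void* ptr, std::size_t alignment) {
    const std::size_t off = misalignment(ptr, alignment);
    std::printf("buffer %p is %zu bytes past a %zu-byte boundary\n",
                ptr, off, alignment);
    return off == 0;
}
```

Calling `report_alignment(pinnedPtr, 4096)` right after the allocation, where `pinnedPtr` stands for whatever pointer the CUDA allocation returned, and comparing the output across CUDA versions and across reboots would show whether the start address is what changes.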

Good question.

When I start the DMA transfer, it is a blocking call that monitors the transfer until it completes and then returns success or failure. With the beta version of CUDA, it would block for 1 to 2 ms and then complete. With the release version of CUDA, it blocks for about 1 second and then completes with a failure. No data is transferred to the device’s memory.

I don’t have a complete understanding of DMA scatter-gather, so please forgive me if this sounds ignorant.

I’ve been doing some reading and can see that DMA controllers can use scatter-gather to transfer from device to mapped memory. I’m assuming that this mapped memory does not have to be linear, so there must be some type of vector table. If this is correct, could this table have changed between the beta and release version?
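For anyone else reading along, here is a rough illustration of the "vector table" idea: a buffer that is contiguous in virtual memory is described to the DMA controller as a list of per-page runs. This is a simplified sketch, not how any particular driver does it. `SgSegment` and `split_by_page` are made-up names, a 4 KB page size is assumed, and a real driver would store the physical address of each locked page rather than the virtual address used here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One entry of an illustrative scatter-gather list: a run of bytes that
// stays inside a single page.
struct SgSegment {
    std::uintptr_t address;  // virtual start of the run (illustration only)
    std::size_t length;      // number of bytes in the run
};

// Splits [base, base + size) into runs that never cross a page boundary.
// `page_size` must be a power of two.
std::vector<SgSegment> split_by_page(std::uintptr_t base, std::size_t size,
                                     std::size_t page_size = 4096) {
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    std::vector<SgSegment> segments;
    std::uintptr_t cursor = base;
    std::size_t remaining = size;
    while (remaining > 0) {
        // Bytes left before the next page boundary.
        const std::size_t offset_in_page =
            static_cast<std::size_t>(cursor & (page_size - 1));
        const std::size_t room = page_size - offset_in_page;
        const std::size_t len = remaining < room ? remaining : room;
        segments.push_back({cursor, len});
        cursor += len;
        remaining -= len;
    }
    return segments;
}
```

With a page-aligned base, an 8192-byte buffer yields two full-page entries. Start the same buffer 256 bytes into a page and it becomes three entries, with a short first and last run. If a DMA controller or its driver assumes every entry starts on a page boundary, that second layout is the kind of thing that could fail or produce shifted data.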

Crazy.

I uninstalled the video driver and reinstalled, and now all is working fine. Don’t know what the issue was, but it appears to be resolved at this moment. I will continue to test to make sure this is not a random problem.

Well, it appears that I spoke too soon. With additional testing, I found the results to be random with the CUDA 2.0 release. Sometimes the DMA transfers fail completely, other times they appear to be out of alignment, and sometimes they work just fine. Starting and stopping my application has no effect on this behavior. Rebooting the computer is the only action that changes these results. So if the computer boots up and DMA transfers are working, they will keep working until the computer is restarted. If the computer boots up and DMA transfers are misaligned, they will stay misaligned until the next reboot.

Rolled my machine back to CUDA 1.1 and all is solid. I didn’t have this issue with CUDA 2.0 beta 1, but did have it with beta 2.
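To put a number on the "out of alignment" case, one option is to fill the pinned buffer with a pattern in which every 32-bit word holds its own index, run the DMA, and then look at what arrived. The sketch below assumes there is some way to read the transferred frame back from the card, which I have not confirmed for AJA or Bluefish; `fill_index_pattern` and `detect_shift` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Fills the buffer so that word i contains the value i.
void fill_index_pattern(std::vector<std::uint32_t>& buf) {
    assert(buf.size() <= 0xFFFFFFFFu);  // indices must fit in 32 bits
    for (std::size_t i = 0; i < buf.size(); ++i) {
        buf[i] = static_cast<std::uint32_t>(i);
    }
}

// Checks whether `received` is the index pattern displaced by a constant
// number of words. On success, returns true and stores the displacement
// in `shift_words` (0 means the data arrived intact). Returns false if the
// data is empty or does not match the pattern at any constant shift.
bool detect_shift(const std::vector<std::uint32_t>& received,
                  std::int64_t& shift_words) {
    if (received.empty()) {
        return false;
    }
    const std::int64_t shift = static_cast<std::int64_t>(received[0]);
    for (std::size_t i = 0; i < received.size(); ++i) {
        const std::int64_t expected = static_cast<std::int64_t>(i) + shift;
        if (static_cast<std::int64_t>(received[i]) != expected) {
            return false;
        }
    }
    shift_words = shift;
    return true;
}
```

A non-zero shift that stays the same until the next reboot would point toward a start-address or page-offset problem rather than corrupted data.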

Anyone have any ideas???

Thanks,

Kurt