Concurrent Data Transfer on GTX480

One of the Fermi features I have been waiting for is concurrent data transfers, as described in the NVIDIA CUDA Programming Guide:
“Devices of compute capability 2.0 can perform a copy from page-locked host memory to device memory concurrently with a copy from device memory to page-locked host memory.”

But when I try this on my GTX 480, I cannot get the two copies to overlap. Is there some other condition for concurrent data transfers, or is the GTX 480 simply incapable of them?

Is this a bug? If so, when will a fix be released?
If anyone has managed to make it work, could you please post a snippet of the code?

Thank you for the information

According to Sumit here:…amp;pid=1073830

The GeForce series has only one DMA engine, so it cannot do bidirectional transfers. Only the Fermi-based Tesla cards can do this.
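
For anyone who wants to check their own card, here is a minimal sketch of how the overlap could be tested. It is not taken from the Programming Guide. It assumes a CUDA toolkit recent enough to expose the `asyncEngineCount` field of `cudaDeviceProp` (older toolkits only have `deviceOverlap`), and the buffer size, variable names, and the `CHECK` macro are my own choices for illustration. It issues a host-to-device and a device-to-host copy from page-locked memory, first in one stream and then in two separate streams, and times both.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper macro: bail out of main() on any CUDA error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

int main() {
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));
    // 2 means the device can copy in both directions at once;
    // 1 means only one copy direction at a time.
    printf("%s: asyncEngineCount = %d\n", prop.name, prop.asyncEngineCount);

    const size_t bytes = 64u << 20;  // 64 MiB per direction (arbitrary size)
    float *hIn, *hOut, *dIn, *dOut;
    // Page-locked host memory is required for asynchronous copies.
    CHECK(cudaMallocHost((void**)&hIn, bytes));
    CHECK(cudaMallocHost((void**)&hOut, bytes));
    CHECK(cudaMalloc((void**)&dIn, bytes));
    CHECK(cudaMalloc((void**)&dOut, bytes));
    CHECK(cudaMemset(dOut, 0, bytes));

    cudaStream_t s0, s1;
    CHECK(cudaStreamCreate(&s0));
    CHECK(cudaStreamCreate(&s1));
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    float msSerial = 0.0f, msTwoStreams = 0.0f;

    // Baseline: both copies in the same stream, so they must run back to back.
    CHECK(cudaEventRecord(start, 0));
    CHECK(cudaMemcpyAsync(dIn, hIn, bytes, cudaMemcpyHostToDevice, s0));
    CHECK(cudaMemcpyAsync(hOut, dOut, bytes, cudaMemcpyDeviceToHost, s0));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));
    CHECK(cudaEventElapsedTime(&msSerial, start, stop));

    // Candidate for overlap: one copy direction per stream.
    CHECK(cudaEventRecord(start, 0));
    CHECK(cudaMemcpyAsync(dIn, hIn, bytes, cudaMemcpyHostToDevice, s0));
    CHECK(cudaMemcpyAsync(hOut, dOut, bytes, cudaMemcpyDeviceToHost, s1));
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));
    CHECK(cudaEventElapsedTime(&msTwoStreams, start, stop));

    printf("same stream: %.2f ms, two streams: %.2f ms\n",
           msSerial, msTwoStreams);

    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));
    CHECK(cudaStreamDestroy(s0));
    CHECK(cudaStreamDestroy(s1));
    CHECK(cudaFree(dIn));
    CHECK(cudaFree(dOut));
    CHECK(cudaFreeHost(hIn));
    CHECK(cudaFreeHost(hOut));
    return 0;
}
```

If the two-stream time is clearly shorter than the same-stream time, the copies overlapped. On a card with a single DMA engine I would expect the two numbers to come out roughly equal, which would match what Sumit describes.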