bi-directional PCI-E transfer overlap

I ran your code on my new GTX 480 Fermi card with the CUDA SDK 3.1 beta, and it still overlaps bi-directional data transfers.

NVIDIA announced that Fermi supports two DMA transfer engines, which means PCI-E duplex communication should be enabled.

Could someone clarify if bidirectional copies can be done on non-Tesla Fermis?

Tim said it can only be done on Tesla Fermis, but fishbupt says above that it can be done on a GTX 480?

I would like to know so I can decide whether it's worth getting a GTX 470 or 480 so another developer can test and benchmark, without buying another Tesla C2050.

I have seen bi-directional bandwidth on a GTX 480 in our lab. The result is reported here:

http://forums.nvidia.com/index.php?showtop…t=#entry1073830

But I also tested a second system with a GTX 480 in another lab. On that system I do not see the bi-directional bandwidth.
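
For anyone who wants to compare two machines like this, here is a minimal sketch of one way to check for overlap. It is not the code from the linked thread. It times a host-to-device copy, a device-to-host copy, and then both issued together on two streams. The buffer size, the names (`timeSeconds`, `gbPerSec`, `up`, `down`) and the use of `cudaDeviceSynchronize` are my own assumptions; as far as I know that call needs a toolkit newer than the 3.1 beta, where `cudaThreadSynchronize` would take its place.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t e_ = (call);                                   \
        if (e_ != cudaSuccess) {                                   \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(e_));                  \
            std::exit(1);                                          \
        }                                                          \
    } while (0)

// Bandwidth in GB/s (10^9 bytes per second).
double gbPerSec(size_t bytes, double seconds) {
    return (static_cast<double>(bytes) / 1.0e9) / seconds;
}

// Wall-clock seconds for the work enqueued by `issue`, including completion.
template <typename F>
double timeSeconds(F issue) {
    CHECK(cudaDeviceSynchronize());
    auto t0 = std::chrono::steady_clock::now();
    issue();
    CHECK(cudaDeviceSynchronize());
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
    const size_t bytes = 64u << 20;  // 64 MiB per direction (arbitrary choice)
    char *hIn, *hOut, *dIn, *dOut;

    // Asynchronous copies need page-locked (pinned) host memory.
    CHECK(cudaMallocHost((void**)&hIn, bytes));
    CHECK(cudaMallocHost((void**)&hOut, bytes));
    CHECK(cudaMalloc((void**)&dIn, bytes));
    CHECK(cudaMalloc((void**)&dOut, bytes));
    std::memset(hIn, 1, bytes);
    CHECK(cudaMemset(dOut, 2, bytes));

    cudaStream_t up, down;  // one stream per direction
    CHECK(cudaStreamCreate(&up));
    CHECK(cudaStreamCreate(&down));

    double tUp = timeSeconds([&] {
        CHECK(cudaMemcpyAsync(dIn, hIn, bytes, cudaMemcpyHostToDevice, up));
    });
    double tDown = timeSeconds([&] {
        CHECK(cudaMemcpyAsync(hOut, dOut, bytes, cudaMemcpyDeviceToHost, down));
    });
    double tBoth = timeSeconds([&] {
        CHECK(cudaMemcpyAsync(dIn, hIn, bytes, cudaMemcpyHostToDevice, up));
        CHECK(cudaMemcpyAsync(hOut, dOut, bytes, cudaMemcpyDeviceToHost, down));
    });

    std::printf("H2D alone : %.2f GB/s\n", gbPerSec(bytes, tUp));
    std::printf("D2H alone : %.2f GB/s\n", gbPerSec(bytes, tDown));
    std::printf("Both      : %.2f GB/s aggregate\n", gbPerSec(2 * bytes, tBoth));
    std::printf("tBoth / (tUp + tDown) = %.2f\n", tBoth / (tUp + tDown));

    CHECK(cudaStreamDestroy(up));
    CHECK(cudaStreamDestroy(down));
    CHECK(cudaFree(dIn));
    CHECK(cudaFree(dOut));
    CHECK(cudaFreeHost(hIn));
    CHECK(cudaFreeHost(hOut));
    return 0;
}
```

If the last ratio comes out near 1.0, the two copies were most likely serialized. A ratio closer to 0.5 suggests they ran concurrently. A single run is noisy, so it is worth repeating the measurement a few times before drawing conclusions.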

According to “sumitq” in the above link, Fermi consumer cards should not have the bi-directional capability:

- Tesla 20-series has 2 DMA Engines (copy engines). GeForce has 1 DMA Engine. This means that CUDA applications can overlap computation and communication on Tesla using bi-directional communication over PCI-e.
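
One way to see what the driver itself reports is to read the copy-engine count from the device properties. This is only a sketch: it assumes a toolkit that exposes the `asyncEngineCount` field of `cudaDeviceProp` (which I believe arrived after the 3.1 beta), and the helper `describeCopyEngines` is a name I made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: turn the reported copy-engine count into a short note.
const char* describeCopyEngines(int asyncEngineCount) {
    if (asyncEngineCount >= 2) return "copies in both directions can overlap";
    if (asyncEngineCount == 1) return "one copy at a time can overlap with kernels";
    return "no asynchronous copy engine reported";
}

int main() {
    int deviceCount = 0;
    cudaError_t err = cudaGetDeviceCount(&deviceCount);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceCount failed: %s\n",
                     cudaGetErrorString(err));
        return 1;
    }

    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaDeviceProp prop;
        if (cudaGetDeviceProperties(&prop, dev) != cudaSuccess) continue;
        // asyncEngineCount is the number of copy (DMA) engines the runtime reports.
        std::printf("Device %d (%s): asyncEngineCount = %d, %s\n",
                    dev, prop.name, prop.asyncEngineCount,
                    describeCopyEngines(prop.asyncEngineCount));
    }
    return 0;
}
```

The reported value is what the runtime exposes for that board and driver, so two cards of the same model could in principle show different numbers on different setups. A timing test is still the more direct evidence.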