I ran your code on my new GTX 480 Fermi card with CUDA SDK 3.1 beta, but still overlap bi-directional data transfer.
NV announced that Fermi should support two DMA transfer engines, which means PCI-E duplex communication should be enabled.
Could someone clarify whether bidirectional copies can be done on non-Tesla Fermi cards?
Tim said it can only be done on Tesla Fermi cards,
but fishbupt says above that it can be done on the GTX 480?
I would like to know so I can decide whether it’s worth getting a GTX 470 or 480 so another developer can test and benchmark without buying another Tesla C2050.
I have seen bi-directional bandwidth on a GTX 480 in our lab. The result is reported here:
http://forums.nvidia.com/index.php?showtop…t=#entry1073830
But I also tested another system with a GTX 480 in a different lab. On that system I do not see the bi-directional bandwidth.
According to “sumitq” in the above link, Fermi consumer cards should not have the bi-directional capability:
- Tesla 20-series has 2 DMA Engines (copy engines). GeForce has 1 DMA Engine. This
means that CUDA applications can overlap computation and communication on Tesla using
bi-directional communication over PCI-e.
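
Since the reports above disagree, one way to check a particular card is to ask the runtime how many copy engines it reports and then time a host-to-device and a device-to-host copy issued together. The following is only a rough sketch, not the code discussed earlier in the thread. It assumes a CUDA toolkit newer than the 3.1 beta mentioned above (the `asyncEngineCount` field of `cudaDeviceProp` and `cudaDeviceSynchronize` are not available in that release, as far as I know), device 0, and a 64 MiB buffer chosen arbitrarily. `supportsBidirectionalCopy` and `timeCopies` are made-up helper names.

```cuda
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical helper: simultaneous H2D and D2H copies need two copy engines.
static bool supportsBidirectionalCopy(int asyncEngineCount) {
    return asyncEngineCount >= 2;
}

// Hypothetical helper: issue one H2D and one D2H copy, wait for both,
// and return the elapsed wall-clock time in milliseconds.
static double timeCopies(char* dIn, char* dOut, char* hIn, char* hOut,
                         size_t bytes, cudaStream_t up, cudaStream_t down) {
    CHECK(cudaDeviceSynchronize());
    auto t0 = std::chrono::steady_clock::now();
    CHECK(cudaMemcpyAsync(dIn, hIn, bytes, cudaMemcpyHostToDevice, up));
    CHECK(cudaMemcpyAsync(hOut, dOut, bytes, cudaMemcpyDeviceToHost, down));
    CHECK(cudaDeviceSynchronize());
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

int main() {
    cudaDeviceProp prop;
    CHECK(cudaGetDeviceProperties(&prop, 0));  // device 0 is an assumption
    std::printf("%s: asyncEngineCount = %d, bidirectional copies %s\n",
                prop.name, prop.asyncEngineCount,
                supportsBidirectionalCopy(prop.asyncEngineCount)
                    ? "reported as possible" : "not reported");

    const size_t bytes = 64u << 20;  // 64 MiB per direction (arbitrary)
    char *hIn, *hOut, *dIn, *dOut;
    // Pinned host memory is needed for copies to be truly asynchronous.
    CHECK(cudaMallocHost((void**)&hIn, bytes));
    CHECK(cudaMallocHost((void**)&hOut, bytes));
    CHECK(cudaMalloc((void**)&dIn, bytes));
    CHECK(cudaMalloc((void**)&dOut, bytes));
    std::memset(hIn, 1, bytes);
    CHECK(cudaMemset(dOut, 0, bytes));

    cudaStream_t s0, s1;
    CHECK(cudaStreamCreate(&s0));
    CHECK(cudaStreamCreate(&s1));

    timeCopies(dIn, dOut, hIn, hOut, bytes, s0, s1);  // warm-up run
    // Same stream: the two copies are forced to run one after the other.
    double serialMs = timeCopies(dIn, dOut, hIn, hOut, bytes, s0, s0);
    // Different streams: the copies may overlap if the hardware allows it.
    double twoStreamMs = timeCopies(dIn, dOut, hIn, hOut, bytes, s0, s1);
    std::printf("one stream: %.2f ms, two streams: %.2f ms\n",
                serialMs, twoStreamMs);

    CHECK(cudaStreamDestroy(s0));
    CHECK(cudaStreamDestroy(s1));
    CHECK(cudaFree(dIn));
    CHECK(cudaFree(dOut));
    CHECK(cudaFreeHost(hIn));
    CHECK(cudaFreeHost(hOut));
    return 0;
}
```

If the two-stream time comes out close to half of the one-stream time, the copies are probably overlapping. If the two times are about the same, the card (or driver) is most likely serializing the two directions. A single run is noisy, so it is worth repeating the measurement a few times before drawing conclusions.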