I have tried this on two separate machines, one with dual GTX590s and one with dual GTX580s. I have tried various drivers from 270.35 to 270.41.34, including the one shipped with the 4.0 final release, as well as 275.36 and 285.05.09, all with similar results. Every run passes the peer access and UVA checks and reports 6.19 GB/s for the 590s and half that for the 580s. The 580s report an x8 bus. In all cases the app reports a verification error, sometimes starting at a different location. I have seen reports that this app runs successfully on the GTX590. If anybody has an idea what I might be doing wrong, please make a suggestion.
Hubert
Update
This seems to be some type of timing issue. I modified the code for my purposes to test all available Fermi GPUs and added intermediate verification of the results. After adding the additional verifications, the test ran with no problems. I will explore further, but with the new 4.1 RC1.
Hubert
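For anyone who wants to run the same kind of check across every Fermi card in a box, here is a minimal sketch, not the modified sample itself. It assumes "Fermi" means compute capability 2.x, and the CHECK macro is a hypothetical helper of mine, not part of the SDK. It lists the 2.x devices and prints whether each ordered pair reports peer access.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical error-checking helper (not from the SDK sample).
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

int main() {
    int count = 0;
    CHECK(cudaGetDeviceCount(&count));

    // Collect devices with compute capability 2.x (assumed to mean Fermi).
    std::vector<int> fermi;
    for (int d = 0; d < count; ++d) {
        cudaDeviceProp prop;
        CHECK(cudaGetDeviceProperties(&prop, d));
        printf("GPU %d: %s, compute %d.%d, UVA %s\n", d, prop.name,
               prop.major, prop.minor,
               prop.unifiedAddressing ? "yes" : "no");
        if (prop.major == 2) fermi.push_back(d);
    }

    // Query peer access for every ordered pair of the selected devices.
    for (size_t a = 0; a < fermi.size(); ++a) {
        for (size_t b = 0; b < fermi.size(); ++b) {
            if (a == b) continue;
            int canAccess = 0;
            CHECK(cudaDeviceCanAccessPeer(&canAccess, fermi[a], fermi[b]));
            printf("GPU %d -> GPU %d: peer access %s\n", fermi[a], fermi[b],
                   canAccess ? "supported" : "not supported");
        }
    }
    return 0;
}
```

Filtering on the compute capability also skips older 1.x display cards, so they never take part in the peer queries.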
The version of simpleP2P in the RC fixed all problems. In addition, it worked even when extraneous CUDA devices were present on the system. I have some 1.x motherboards for display and dedicate my other cards to CUDA.
The other changes I noticed were the move away from cutilSafeCall and the use of cudaDeviceSynchronize after the kernel runs.
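To illustrate those two changes, here is a rough sketch of the pattern, not the actual RC code. The CHECK macro, the scale kernel, and the use of devices 0 and 1 are all assumptions of mine. It checks each runtime call without cutil, verifies the data right after the peer copy, and calls cudaDeviceSynchronize after the kernel before checking the final result.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for cutilSafeCall: abort on any runtime error.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Illustrative kernel: out[i] = 2 * in[i].
__global__ void scale(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 2.0f * in[i];
}

// Count elements that differ from factor * (i % 4096).
static int countErrors(const std::vector<float>& h, float factor) {
    int errors = 0;
    for (size_t i = 0; i < h.size(); ++i)
        if (h[i] != factor * float(i % 4096)) ++errors;
    return errors;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const int dev0 = 0, dev1 = 1;  // assumption: the two P2P-capable GPUs

    int count = 0, can01 = 0, can10 = 0;
    CHECK(cudaGetDeviceCount(&count));
    if (count < 2) { printf("Need at least two GPUs.\n"); return 0; }
    CHECK(cudaDeviceCanAccessPeer(&can01, dev0, dev1));
    CHECK(cudaDeviceCanAccessPeer(&can10, dev1, dev0));
    if (!can01 || !can10) { printf("Peer access not supported.\n"); return 0; }

    std::vector<float> host(n), result(n);
    for (int i = 0; i < n; ++i) host[i] = float(i % 4096);

    float *d0 = NULL, *d1in = NULL, *d1out = NULL;
    CHECK(cudaSetDevice(dev0));
    CHECK(cudaDeviceEnablePeerAccess(dev1, 0));
    CHECK(cudaMalloc(&d0, bytes));
    CHECK(cudaMemcpy(d0, host.data(), bytes, cudaMemcpyHostToDevice));

    CHECK(cudaSetDevice(dev1));
    CHECK(cudaDeviceEnablePeerAccess(dev0, 0));
    CHECK(cudaMalloc(&d1in, bytes));
    CHECK(cudaMalloc(&d1out, bytes));

    // Copy GPU 0 -> GPU 1, then verify before going any further.
    CHECK(cudaMemcpyPeer(d1in, dev1, d0, dev0, bytes));
    CHECK(cudaMemcpy(result.data(), d1in, bytes, cudaMemcpyDeviceToHost));
    printf("after peer copy: %d errors\n", countErrors(result, 1.0f));

    // Launch on GPU 1, catch launch errors, then wait for completion.
    scale<<<(n + 255) / 256, 256>>>(d1in, d1out, n);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaMemcpy(result.data(), d1out, bytes, cudaMemcpyDeviceToHost));
    printf("after kernel: %d errors\n", countErrors(result, 2.0f));

    CHECK(cudaFree(d1in));
    CHECK(cudaFree(d1out));
    CHECK(cudaSetDevice(dev0));
    CHECK(cudaFree(d0));
    return 0;
}
```

In this sketch the blocking cudaMemcpy calls would wait for the kernel anyway, so the explicit synchronize mainly reports any kernel execution error at a known point. It matters more when the next step does not wait on its own, which may or may not be what went wrong in the 4.0 sample.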