bandwidthTest on Tesla S1070

I’ve just started tuning my application for the Tesla S1070, but I’m getting lower memory bandwidth than I expected. If I run the SDK example bandwidthTest on my GTX 260 (theoretical bandwidth about 120GB/s) I get about 91GB/s, which is roughly 76% of peak. If I run it on the S1070 (theoretical bandwidth about 104GB/s) I get about 72GB/s, which is only about 69% of peak. At the same efficiency as the GTX 260 I would expect about 79GB/s, so the S1070 result seems a little (just under 10%) low. On the S1070 machine I’m using Windows Server 2008 R2 with the Tesla drivers in TCC mode, and I’m running in a remote desktop session. On the GTX 260 machine I’m using Windows XP Pro 64-bit with the normal GeForce drivers.
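To make the comparison explicit, here is a rough sketch of the arithmetic behind the "a little low" estimate. It uses only the approximate figures quoted above; the variable names and the `efficiency` helper are my own and are not part of the SDK sample.

```python
# Approximate figures quoted in the post, in GB/s.
gtx260_peak, gtx260_measured = 120.0, 91.0
s1070_peak, s1070_measured = 104.0, 72.0


def efficiency(measured, peak):
    """Fraction of theoretical bandwidth actually achieved."""
    return measured / peak


gtx260_eff = efficiency(gtx260_measured, gtx260_peak)
s1070_eff = efficiency(s1070_measured, s1070_peak)

# What the S1070 would report if it matched the GTX 260's efficiency.
expected_s1070 = gtx260_eff * s1070_peak

# Relative gap between that expectation and the measured value.
shortfall = 1.0 - s1070_measured / expected_s1070

print(f"GTX 260 efficiency: {gtx260_eff:.1%}")
print(f"S1070 efficiency:   {s1070_eff:.1%}")
print(f"Expected S1070 at GTX 260 efficiency: {expected_s1070:.1f} GB/s")
print(f"Shortfall: {shortfall:.1%}")
```

This assumes both cards should reach the same fraction of their theoretical peak, which may not hold across different drivers and operating systems.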