Dear all,
I’ve recently put together a system with an ASUS P9X79 WS motherboard, an Intel i7-3970X CPU, 48 GB of DDR3-1866 RAM and
an ASUS GTX TITAN.
I’m using Ubuntu 12.04 LTS.
I’ve installed driver 319.32 and load it with NVreg_EnablePCIeGen3=1, so that I see PCIe 3.0 speeds in the bandwidth test:
Device 0: GeForce GTX TITAN
Quick Mode
Host to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 11247.0
Device to Host Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 11236.9
Device to Device Bandwidth, 1 Device(s)
PINNED Memory Transfers
Transfer Size (Bytes) Bandwidth(MB/s)
33554432 220495.0
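As a rough check that the link really runs at Gen3, the measured host/device numbers can be compared with the theoretical x16 peaks. This is only a sketch: the helper name pcie_peak_mb_s is made up for illustration, I assume a x16 slot, and I use the nominal line rates and encodings (5 GT/s with 8b/10b for PCIe 2.0, 8 GT/s with 128b/130b for PCIe 3.0). Protocol overhead is ignored, so real transfers always land below these peaks.

```python
def pcie_peak_mb_s(gigatransfers_per_s, payload_bits, encoded_bits, lanes=16):
    """Theoretical one-direction peak in MB/s (1 MB = 10**6 bytes).

    Hypothetical helper: ignores packet/protocol overhead, assumes `lanes` lanes.
    """
    bits_per_s = gigatransfers_per_s * 1e9 * lanes * payload_bits / encoded_bits
    return bits_per_s / 8 / 1e6

gen2_peak = pcie_peak_mb_s(5.0, 8, 10)      # PCIe 2.0: 5 GT/s, 8b/10b encoding
gen3_peak = pcie_peak_mb_s(8.0, 128, 130)   # PCIe 3.0: 8 GT/s, 128b/130b encoding

measured_h2d = 11247.0  # host-to-device figure from the output above

print(f"PCIe 2.0 x16 peak: {gen2_peak:.0f} MB/s")
print(f"PCIe 3.0 x16 peak: {gen3_peak:.0f} MB/s")
print(f"measured exceeds the Gen2 peak: {measured_h2d > gen2_peak}")
```

Since the measured value is above what a Gen2 x16 link could deliver even in theory, the Gen3 setting appears to be taking effect.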
I have CUDA 5.0.35 installed (the linux_64_ubuntu11.10 package).
When I compiled the nbody sample, the compiler complained that double precision was not supported. I edited the Makefile to pass
-gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=sm_35
to the compiler, and the warning disappeared.
Now I am trying to reproduce the benchmark figures from the post “GTX Titan drivers for Linux 32/64 bit release?” on the NVIDIA Developer Forums.
However, I get:
./nbody -benchmark -numbodies=229376 -device=0 -fp64
> Compute 3.5 CUDA device: [GeForce GTX TITAN]
number of bodies = 229376
229376 bodies, total time for 10 iterations: 91547.242 ms
= 5.747 billion interactions per second
= 172.414 double-precision GFLOP/s at 30 flops per interaction
This is about a quarter of the values reported there.
Single precision is fine:
./nbody -benchmark -numbodies=229376 -device=0
> Compute 3.5 CUDA device: [GeForce GTX TITAN]
number of bodies = 229376
229376 bodies, total time for 10 iterations: 5238.231 ms
= 100.441 billion interactions per second
= 2008.821 single-precision GFLOP/s at 20 flops per interaction
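To double-check how the two sets of figures are derived, here is a small sketch of the arithmetic. It assumes the benchmark counts num_bodies squared interactions per iteration (an all-pairs count, which is consistent with the printed numbers) and uses the flops-per-interaction values printed above. The function name nbody_throughput is just for illustration.

```python
def nbody_throughput(num_bodies, iterations, total_ms, flops_per_interaction):
    """Return (billion interactions/s, GFLOP/s).

    Assumes an all-pairs count: num_bodies**2 interactions per iteration.
    """
    interactions = num_bodies ** 2 * iterations
    seconds = total_ms / 1000.0
    ginteractions_per_s = interactions / seconds / 1e9
    return ginteractions_per_s, ginteractions_per_s * flops_per_interaction

# Timings copied from the two runs above
fp64_rate, fp64_gflops = nbody_throughput(229376, 10, 91547.242, 30)
fp32_rate, fp32_gflops = nbody_throughput(229376, 10, 5238.231, 20)

print(f"fp64: {fp64_rate:.3f} G interactions/s, {fp64_gflops:.3f} GFLOP/s")
print(f"fp32: {fp32_rate:.3f} G interactions/s, {fp32_gflops:.3f} GFLOP/s")
print(f"fp64/fp32 GFLOP/s ratio: {fp64_gflops / fp32_gflops:.3f}")
```

The recomputed values match the program output, so the low double-precision figure comes from the measured time itself and not from how the benchmark reports it.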
Do you have any idea what is going wrong?