I am running Ubuntu 10.04 64-bit (2.6.32-21-server kernel). The CUDA toolkit is the latest version, 3.2, and the driver is 260.24.
The GPU is a Tesla C2050. I am not running the X server since it is a server machine, so I use the script provided by NVIDIA to create the /dev/nvidia* files.
Bandwidth numbers reported by bandwidthTest:
Host to device bandwidth: 380 MB/s
Device to host bandwidth: 393 MB/s
Device to device bandwidth: 91000 MB/s
I am getting only a 32-bit DMA mask in my /proc/driver/nvidia/card/0 file.
Does anybody have a clue as to what is happening here?
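For reference, this is roughly the kind of host-to-device measurement bandwidthTest performs (a minimal sketch with an arbitrary buffer size and iteration count, not the actual sample code):

// Rough sketch of a host-to-device bandwidth measurement, similar in spirit
// to the SDK bandwidthTest sample. Buffer size and iteration count are
// arbitrary choices for illustration; error checking is omitted.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main()
{
    const size_t bytes = 64 * 1024 * 1024;   // 64 MB transfer
    const int    iters = 10;

    void *h_buf = malloc(bytes);             // pageable host memory
    void *d_buf = NULL;
    cudaMalloc(&d_buf, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time a batch of host-to-device copies with CUDA events.
    cudaEventRecord(start, 0);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    double mb = (double)bytes * iters / (1024.0 * 1024.0);
    printf("Host to device bandwidth: %.1f MB/s\n", mb / (ms / 1000.0));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_buf);
    free(h_buf);
    return 0;
}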
If you start the NVIDIA X Server Settings GUI and select the C2050, what does it show for the bus type? Is it the full x16 width? It will probably show as Gen 1, but it should clock back up to Gen 2 speed when you run a CUDA program.
Did you use any special flags when you ran the test? Running the bandwidth test with "--memory=pinned" should yield the best performance. I've seen some cases where the default pageable memory is really slow but pinned memory is fine; usually a system BIOS update resolves that. There may also be NUMA effects coming into play, so you could try numactl and experiment with the affinity or interleaving.
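For what it's worth, the only application-side difference in a pinned run is how the host buffer is allocated. A rough sketch of that variant (arbitrary buffer size, not the sample's actual code):

// Sketch of the pinned-memory variant: the only change from a pageable run
// is the host allocation. This is the allocation path that
// "--memory=pinned" exercises in bandwidthTest.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const size_t bytes = 64 * 1024 * 1024;

    void *d_buf = NULL;
    cudaMalloc(&d_buf, bytes);

    // Page-locked (pinned) host memory instead of a plain malloc.
    void *h_pinned = NULL;
    cudaHostAlloc(&h_pinned, bytes, cudaHostAllocDefault);

    // Copies from pinned memory can be DMA'd directly, without an extra
    // staging copy through a driver-managed pageable buffer.
    cudaMemcpy(d_buf, h_pinned, bytes, cudaMemcpyHostToDevice);

    cudaFreeHost(h_pinned);
    cudaFree(d_buf);
    return 0;
}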
I am not running the X server right now. When I was running it before, it showed the slot as x1 Gen1. I also ran the test with both the pinned and pageable memory options, but the results are not much different.
Any ideas on why the DMA mask in /proc/driver/nvidia/card/0 is 32 bits? Normally it is 40 bits.