What is the full potential of my GPU?

For quite some time now I have sensed that data transfers from host to device and back are far too slow, and this bottleneck is a real problem for me.
I’m using a GeForce 8600 GT in an HP Compaq 6100 MT, in a PCI Express slot.
If I understand correctly, the maximum speed of PCI Express (which the GPU should be using to the fullest) is 4 GB/s.
I’m taking this bit of information from here:

However, when I run bandwidthTest.exe from the SDK I get host-to-device bandwidth of ~600-650 MB/s and device-to-host bandwidth of ~750-800 MB/s. These transfer rates are consistent with what I’ve measured in other CUDA code I wrote in the past.

My question is: is this normal? These speeds are only about 1/5 to 1/7 of the max potential (if I understand the max potential correctly). Is this a faulty card or motherboard?

Thank you…!


Some mainboards don’t even have 16 PCI Express lanes. For example, my ASRock 4CoreDual-SATA2 only has 4 lanes connected. It’s a limitation of the chipset. Before you suspect a bug, find out about your mainboard specs.


According to the computer’s documentation, it has an x16 PCI Express slot (to which the card is connected).

You can see it here under “Expansion Slots”:



That document only says it has an x16 slot (physical form factor); it does not explicitly say how many lanes are actually connected.

Try upgrading your BIOS and chipset drivers to the latest available versions: it helped a lot of people here.

BTW, I have a true x16 PCI Express 1.0 connection and my G92 8800 tops out at about 2.8 GB/s (H2D) and 2.1 GB/s (D2H) using pinned memory (--memory=pinned).
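
To put the lane-count question in numbers, here is a rough worked calculation. It assumes a first-generation PCI Express link, which as far as I know signals at 2.5 GT/s per lane with 8b/10b encoding; these are peak figures per direction, before any protocol overhead.

```latex
\begin{align*}
\text{per lane} &= 2.5\ \text{GT/s} \times \tfrac{8}{10} = 2\ \text{Gbit/s} = 250\ \text{MB/s} \\
\text{x16 link} &= 16 \times 250\ \text{MB/s} = 4\ \text{GB/s} \\
\text{x4 link}  &= 4 \times 250\ \text{MB/s} = 1\ \text{GB/s}
\end{align*}
```

Under those assumptions, measured rates of 600-800 MB/s would fit under the ceiling of a slot with only 4 lanes connected, although that alone does not prove how the slot is wired.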


BIOS and chipset have been updated to their latest available versions… and it didn’t help. :(

I’ve been getting the same transfer rates on my 8600GT and now on my 8800GTS, at least when I measure with pageable memory. Have you tried running bandwidthTest.exe with --memory=pinned (if I remember correctly)? In my case I was getting 2-2.5 GB/s with pinned memory. You might also try increasing the data size from the default 32 MB to 64 MB or even more.

My motherboard is a Gigabyte GA-K8N51GMF-9-RH (a pretty old Socket 939 board based on the NVIDIA GeForce 6100 chipset).

Also, the real max potential you might expect (i.e. in practice) is around 2.5 GB/s, up to 3 GB/s with a really good mobo. 4 GB/s is just for marketing.
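
If you want to check the pinned versus pageable difference outside of the SDK sample, something like the following minimal sketch should do. It is not the bandwidthTest source; the helper name `h2dBandwidthMBs`, the `CHECK` macro, the 32 MB buffer and the 10 repetitions are just choices made for illustration. It only uses standard CUDA runtime calls (`cudaMalloc`, `cudaMallocHost`, `cudaMemcpy` and the event timing functions).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails (illustrative helper).
#define CHECK(call)                                                \
    do {                                                           \
        cudaError_t err_ = (call);                                 \
        if (err_ != cudaSuccess) {                                 \
            std::fprintf(stderr, "%s failed: %s\n", #call,         \
                         cudaGetErrorString(err_));                \
            std::exit(EXIT_FAILURE);                               \
        }                                                          \
    } while (0)

// Time `reps` host-to-device copies of `bytes` bytes; return MB/s
// (hypothetical helper name, 1 MB = 2^20 bytes here).
static float h2dBandwidthMBs(void* dst, const void* src, size_t bytes, int reps) {
    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));

    CHECK(cudaEventRecord(start, 0));
    for (int i = 0; i < reps; ++i) {
        CHECK(cudaMemcpy(dst, src, bytes, cudaMemcpyHostToDevice));
    }
    CHECK(cudaEventRecord(stop, 0));
    CHECK(cudaEventSynchronize(stop));

    float ms = 0.0f;
    CHECK(cudaEventElapsedTime(&ms, start, stop));
    CHECK(cudaEventDestroy(start));
    CHECK(cudaEventDestroy(stop));

    const float megabytes = (float)bytes * (float)reps / (1024.0f * 1024.0f);
    return megabytes / (ms / 1000.0f);
}

int main() {
    const size_t bytes = (size_t)32 << 20;  // 32 MB per copy (arbitrary choice)
    const int reps = 10;                    // arbitrary repetition count

    void* dev = NULL;
    CHECK(cudaMalloc(&dev, bytes));

    // Pageable host buffer: ordinary malloc.
    void* pageable = std::malloc(bytes);
    if (pageable == NULL) {
        std::fprintf(stderr, "malloc failed\n");
        return EXIT_FAILURE;
    }
    std::memset(pageable, 1, bytes);

    // Pinned (page-locked) host buffer: allocated through the CUDA runtime.
    void* pinned = NULL;
    CHECK(cudaMallocHost(&pinned, bytes));
    std::memset(pinned, 1, bytes);

    std::printf("pageable H2D: %.1f MB/s\n", h2dBandwidthMBs(dev, pageable, bytes, reps));
    std::printf("pinned   H2D: %.1f MB/s\n", h2dBandwidthMBs(dev, pinned, bytes, reps));

    CHECK(cudaFreeHost(pinned));
    std::free(pageable);
    CHECK(cudaFree(dev));
    return 0;
}
```

Compile it with nvcc and compare the two printed numbers. If they come out nearly identical, that points the same way as bandwidthTest does: the limit is probably in the link or the chipset rather than in how the host memory is allocated.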

Nope. Same speeds for pinned memory, same speeds for increased data size.

I tend to blame the motherboard at this point…

Thanks everyone.


If you’re getting the same speeds with pinned and pageable, it’s almost certainly a motherboard problem. I’ve never seen pinned be less than 1.5x pageable.

Pinned is maybe 10% faster, maybe less. Not x1.5, that’s for sure…