Bad PCIe transfer performance (cudaMemcpy), what can cause that?

As far as I know, a host-to-device cudaMemcpy should reach around 2.5 GB/s from standard malloc memory and around 5.5 GB/s from pinned memory (at least for large enough copies).

I'm seeing around 0.5 GB/s for 16 KB copies and around 1.5 GB/s for 1 MB copies. With pinned memory, 1 MB copies reach around 2 GB/s.

The machine is a Core 2 Duo E8400 with 2 GB of DDR3 memory in two channels (two memory sticks). I tried both a GTX 285 and a GT 240. I also checked that the card is running at PCIe x16 (it does drop to PCIe x8 when I put two cards in).

The motherboard is a DP45SG (P45 chipset).

Any idea what is causing the poor performance, and how I can improve it (or how I should choose more appropriate hardware)?

Thanks

I suspect it's partly explained by per-call overhead. 0.5 GB/s in 16 KB chunks is about 32,000 calls per second, or roughly 30 µs per call. You could check this by testing whether 8 KB chunks give the same call rate. Either way, you're clearly nowhere near the bandwidth limit.
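
To make the overhead argument concrete, here is a minimal sketch in plain host-side C++ that fits the simple model time = overhead + size / bandwidth to two measurements. The model itself and the names `CopyModel`, `calls_per_second` and `fit_copy_model` are assumptions for illustration, not anything from the CUDA toolkit, and 1 GB is taken as 1e9 bytes.

```cpp
#include <cassert>

// Result of fitting t(size) = overhead + size / bandwidth to two data points.
// This linear model is an assumption; real transfers may not follow it exactly.
struct CopyModel {
    double overhead_us;     // fixed cost per call, in microseconds
    double bandwidth_gbps;  // asymptotic bandwidth, in GB/s (1 GB = 1e9 bytes)
};

// Seconds taken by one copy of `bytes` at a measured throughput of `gbps`.
double copy_time_s(double bytes, double gbps) {
    return bytes / (gbps * 1e9);
}

// Number of copies per second implied by a throughput at a given copy size.
double calls_per_second(double bytes, double gbps) {
    return gbps * 1e9 / bytes;
}

// Solve for overhead and bandwidth from a small-copy and a large-copy measurement.
CopyModel fit_copy_model(double bytes_small, double gbps_small,
                         double bytes_large, double gbps_large) {
    assert(bytes_large > bytes_small);
    double t_small = copy_time_s(bytes_small, gbps_small);
    double t_large = copy_time_s(bytes_large, gbps_large);
    // Slope of time versus size is the inverse of the asymptotic bandwidth.
    double sec_per_byte = (t_large - t_small) / (bytes_large - bytes_small);
    // Intercept at size zero is the fixed per-call overhead.
    double overhead_s = t_small - bytes_small * sec_per_byte;
    return CopyModel{overhead_s * 1e6, 1.0 / sec_per_byte / 1e9};
}
```

Calling `fit_copy_model(16384, 0.5, 1048576, 1.5)` with the pageable-memory figures quoted in this thread gives an overhead of roughly 22 µs per call and an asymptotic bandwidth of roughly 1.55 GB/s. Under this model, per-call overhead accounts for the slow 16 KB copies, but the large-copy bandwidth is still well below the expected 2.5 GB/s.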

Yes, but 2 GB/s with pinned memory is also less than half the expected bandwidth, and that is for 1 MB copies. It seems to me that something else in the setup is limiting the bandwidth as well. Host-to-host copies also seem slow on this machine, so I think it's something specific to the hardware that I'm missing.

What OS are you running?

GPU-Z: shows the PCIe link speed

CUDA-Z: bandwidth benchmark

CPU-Z: motherboard info

If it is Windows, make sure you have all the motherboard chipset software installed. It may also be a power-saving feature or a link speed issue (PCIe 2.0 vs. 1.0). Maybe try disabling ACPI in the BIOS. Also make sure you have the latest motherboard BIOS.
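
If you would rather measure the copy rate directly than rely on a tool, the following is a minimal sketch of a host-to-device bandwidth test using the CUDA runtime API. The copy sizes, repetition counts and the helper names `CHECK` and `h2d_gbps` are arbitrary choices for illustration, and wall-clock timing is used so that per-call overhead is included in the result.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

// Abort with a message if a CUDA runtime call fails.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s failed: %s\n", #call,            \
                         cudaGetErrorString(err_));                   \
            std::exit(1);                                             \
        }                                                             \
    } while (0)

// Average GB/s (1 GB = 1e9 bytes) over `reps` host-to-device copies.
static double h2d_gbps(void* dev, const void* host, size_t bytes, int reps) {
    CHECK(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice));  // warm-up
    CHECK(cudaDeviceSynchronize());
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) {
        CHECK(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice));
    }
    CHECK(cudaDeviceSynchronize());  // make sure every copy has finished
    auto stop = std::chrono::steady_clock::now();
    double seconds = std::chrono::duration<double>(stop - start).count();
    return static_cast<double>(bytes) * reps / seconds / 1e9;
}

int main() {
    const size_t sizes[] = {size_t(8) << 10, size_t(16) << 10,
                            size_t(1) << 20, size_t(32) << 20};
    const size_t max_bytes = size_t(32) << 20;

    void* dev = nullptr;
    void* pinned = nullptr;
    void* pageable = std::malloc(max_bytes);  // ordinary pageable memory
    if (pageable == nullptr) return 1;
    CHECK(cudaMalloc(&dev, max_bytes));
    CHECK(cudaMallocHost(&pinned, max_bytes));  // page-locked memory
    std::memset(pageable, 0, max_bytes);
    std::memset(pinned, 0, max_bytes);

    for (size_t bytes : sizes) {
        // More repetitions for small copies so the timing is not too noisy.
        int reps = bytes < (size_t(1) << 20) ? 2000 : 50;
        std::printf("%9zu bytes: pageable %.2f GB/s, pinned %.2f GB/s\n",
                    bytes, h2d_gbps(dev, pageable, bytes, reps),
                    h2d_gbps(dev, pinned, bytes, reps));
    }

    CHECK(cudaFreeHost(pinned));
    CHECK(cudaFree(dev));
    std::free(pageable);
    return 0;
}
```

This should build with `nvcc`. Comparing the 8 KB and 16 KB rows shows whether the call rate stays roughly constant (which would point to per-call overhead), while the 32 MB row approximates the sustained bandwidth of the link.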

How do I run GPU-Z, CUDA-Z, or CPU-Z? I have the same problem: the bandwidth between host and device is slow. My GPU is a GTX 260 and my CPU is a Core 2 Quad Q8200. I want to check my PCIe link, but I don't know how to do it. Could you explain in more detail? Thanks a lot. I am new to CUDA, but I am very interested in it.

Hi,
I just searched for GPU-Z, CUDA-Z and CPU-Z on Google and found them all. Thanks a lot.
