What factors affect GPU transfer speed?

jcog · September 15, 2009, 2:41pm

Hey all. I’ve got a few questions about what factors can impact the speed of GPU data transfer on different machines. I’m working with a GTX 295, and it gets two different data transfer speeds on two different machines: 1000 MB/s on one and 1900 MB/s on the other. Both machines use a PCI-e x16 slot and have similar hardware, with the second machine having a better CPU and more RAM. I’m mainly interested in what factors could account for this bandwidth difference so I can research the problem a little more effectively. Thanks for any help.

seibert · September 15, 2009, 2:43pm

Are these pinned (“page-locked”) memory transfers?

jcog · September 15, 2009, 2:49pm

It’s the bandwidth test in the SDK that I’m using for testing, so it tests both pinned and standard memory with the same results.

_Big_Mac · September 15, 2009, 3:32pm

I found that FSB frequency can be a limiting factor on Intel platforms (pre-i7).

Generally, if the PCIe slot is not “faked” or broken, the prime suspect for low HtD/DtH transfer rates is the host’s effective memory bandwidth. It can be affected by the RAM itself (memory frequency, dual-channel vs single-channel) or by the path between the CPU and memory (the FSB mentioned above).

seibert · September 15, 2009, 6:15pm

1000 MB/sec and 1900 MB/sec seem pretty slow for pinned memory transfer rates. That’s not too crazy for pageable memory, as the CPU and memory speed make a much bigger difference there. To copy data from memory that is not pinned by the OS, the NVIDIA driver has to copy your data buffer in chunks to a private memory location that is pinned, and then instruct the GPU to DMA transfer the data over. When that chunk is finished, the process repeats until all of your data is copied. In this case, you want a fast CPU and fast RAM (because data is being copied from one location to another before being sent to the GPU).

When your application uses pinned memory, the performance is usually much better because the GPU can directly grab your data without the overhead of copying it to a private pinned region first. I think the only platform that can copy data to the GPU from pinned and pageable memory at the same rate (nearly 6 GB/sec) is the Core i7 with the X58 chipset.
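To make the two paths described above concrete, here is a minimal CPU-only sketch. It is not the NVIDIA driver's code: the "device" is just another host buffer, the "DMA transfer" is modelled as a `memcpy`, and every name (`CopyStats`, `simulated_dma`, `copy_from_pageable`, `copy_from_pinned`) and the chunk size are assumptions made up for illustration. It only shows why the pageable path makes the CPU touch every byte one extra time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical bookkeeping for the simulation (not a CUDA type).
struct CopyStats {
    std::size_t cpu_staged_bytes = 0;  // bytes the CPU copied into the staging buffer
    std::size_t dma_bytes = 0;         // bytes moved by the simulated DMA engine
    std::size_t dma_transfers = 0;     // number of separate simulated DMA operations
};

// Stand-in for the GPU's DMA engine reading from pinned host memory.
// Assumption: modelled as a plain memcpy into a host-side "device" buffer.
static void simulated_dma(unsigned char* device_dst, const unsigned char* pinned_src,
                          std::size_t bytes, CopyStats& stats) {
    std::memcpy(device_dst, pinned_src, bytes);
    stats.dma_bytes += bytes;
    stats.dma_transfers += 1;
}

// Pageable path: stage the data chunk by chunk through a fixed-size buffer
// that plays the role of the driver's private pinned region.
CopyStats copy_from_pageable(std::vector<unsigned char>& device,
                             const std::vector<unsigned char>& host,
                             std::size_t chunk_bytes) {
    assert(chunk_bytes > 0 && device.size() >= host.size());
    CopyStats stats;
    std::vector<unsigned char> staging(chunk_bytes);
    for (std::size_t offset = 0; offset < host.size(); offset += chunk_bytes) {
        const std::size_t n = std::min(chunk_bytes, host.size() - offset);
        // Extra CPU copy: this step is bounded by CPU and RAM speed.
        std::memcpy(staging.data(), host.data() + offset, n);
        stats.cpu_staged_bytes += n;
        simulated_dma(device.data() + offset, staging.data(), n, stats);
    }
    return stats;
}

// Pinned path: the source buffer is treated as already DMA-able,
// so there is no staging copy and a single transfer suffices.
CopyStats copy_from_pinned(std::vector<unsigned char>& device,
                           const std::vector<unsigned char>& host) {
    assert(device.size() >= host.size());
    CopyStats stats;
    if (!host.empty()) {
        simulated_dma(device.data(), host.data(), host.size(), stats);
    }
    return stats;
}
```

In this model both functions leave the same bytes in the destination, but `copy_from_pageable` reports `cpu_staged_bytes` equal to the buffer size while `copy_from_pinned` reports zero. In a real CUDA program the pinned case corresponds to allocating the host buffer with `cudaMallocHost` instead of `malloc` before calling `cudaMemcpy`.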

jcog · September 15, 2009, 6:39pm

Thanks for all the input guys, I’m learning quite a bit today :)

I was incorrect: the NVIDIA SDK program only runs the test for pageable memory, not pinned. My mistake. So if I understand you correctly, the speed difference can be attributed to the 1900 MB/sec computer having a faster processor? I believe the RAM is the same type in each machine, but the better machine has 2 GB of RAM to the slower machine’s 1 GB. Would having more RAM also contribute to a situation like this, or just the speed of the RAM you do have?

seibert · September 15, 2009, 7:06pm

I’m not sure if memory size is a factor, actually. I’ve never compared two systems differing only in memory size.

mikeheck · September 15, 2009, 9:07pm

Run it from a command line and add the “-p” option to test with pinned memory.

You should see a dramatic improvement on a machine with PCI-E 2.0 hardware.

-Mike
