Ok, I’m sorry for such a beginner question, but when you all talk about ‘bandwidth’, where is it from and where to?
From host to device (through the PCI/AGP/PCIe bus), or how much data per second can be read from the graphics card memory
into GPU registers, or what?
Why is it so important to push a lot of data through the PCIe bus? I thought copying host to device/device to host should be done rarely.
There are two kinds of “bandwidth” you’ll come across:
The device-device bandwidth (i.e. within a graphics card, between the GPU and its own memory) is around 100 GB/s on GT200, theoretically. This is the one that will most likely be the bottleneck during kernel execution.
Then there is the device-host or host-device bandwidth (1 to 5 GB/s) over PCIe (or PCI/AGP, which are quite a bit slower). This is what you use when transferring your data between GPU and CPU.
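To get a feel for the gap between the two, here is a rough back-of-the-envelope sketch in Python. It only uses the ballpark figures quoted above (about 100 GB/s on the card, 1 to 5 GB/s over PCIe). The 256 MiB buffer size and the function name `transfer_time_ms` are made up for illustration, and real transfers will be slower than these ideal numbers.

```python
# Ideal transfer times for one buffer at the bandwidths quoted in this thread.
# Assumptions (not measurements): 100 GB/s on-card, 1 to 5 GB/s over PCIe.

GB = 1e9  # decimal gigabyte, as used in bandwidth figures


def transfer_time_ms(num_bytes, bandwidth_gb_per_s):
    """Ideal time in milliseconds to move num_bytes at the given bandwidth."""
    return num_bytes / (bandwidth_gb_per_s * GB) * 1e3


buffer_bytes = 256 * 1024 * 1024  # hypothetical 256 MiB buffer

device_ms = transfer_time_ms(buffer_bytes, 100.0)   # within the graphics card
pcie_fast_ms = transfer_time_ms(buffer_bytes, 5.0)  # good PCIe case
pcie_slow_ms = transfer_time_ms(buffer_bytes, 1.0)  # poor PCIe case

print(f"device-device: {device_ms:.1f} ms")
print(f"PCIe at 5 GB/s: {pcie_fast_ms:.1f} ms")
print(f"PCIe at 1 GB/s: {pcie_slow_ms:.1f} ms")
```

With these assumed figures, moving the same buffer over PCIe takes roughly 20 to 100 times longer than reading it once from the card's own memory.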
As you already pointed out, transferring data between GPU and CPU should be done rarely. But in some cases, especially if you are working with more than one graphics card, you might have to use this bus a lot. Then these memory transfers account for a big part of your execution time, and the faster you can get them done, the better.
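A very simplified model shows why rare transfers matter. The sketch below assumes the data is copied to the device once, read a number of times by kernels at the on-card bandwidth, and copied back once. The function `transfer_fraction`, its parameters and the default bandwidths are assumptions for illustration only, and the model ignores compute time, latency and any overlap of copies with kernels.

```python
# Toy model: what share of the total time is spent on PCIe copies?
# Assumed defaults: 5 GB/s over PCIe, 100 GB/s within the card.


def transfer_fraction(num_bytes, passes, pcie_gb_s=5.0, device_gb_s=100.0):
    """Fraction of total time spent copying over PCIe.

    The data goes host -> device once and device -> host once, and
    kernels read it `passes` times at the on-card bandwidth.
    """
    copy_time = 2 * num_bytes / (pcie_gb_s * 1e9)
    kernel_time = passes * num_bytes / (device_gb_s * 1e9)
    return copy_time / (copy_time + kernel_time)


one_gb = 10**9
for passes in (1, 40, 400):
    share = transfer_fraction(one_gb, passes)
    print(f"{passes:4d} passes over the data: {share:.0%} of time in PCIe copies")
```

Under these assumptions, a kernel that touches the data only once spends almost all of its time waiting on the bus, while keeping the data on the card across many passes makes the copies a small part of the total. If you have to exchange data between several cards through the host, you are effectively stuck near the low-pass end of this model.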