Hi, I read that a GeForce 8400 GS PCI has 2 stream processors (I read this in ‘NVIDIA_CUDA_Programming_Guide_2.0.pdf’, see
Does this mean that when I buy a GeForce GTX 260 with 24 processors, my app will run 12 times as fast as it did
with the GeForce 8400 GS?
If not, what do you think the factor could be?
well, there are several things that change from an 8400gs pci to a gtx260 pci-e…
first, start with the host-device bandwidth:
with pci you get, uhm, 133 MB/s or something like that.
if you have a gtx260 you can get as much as 5.5 GB/s if you are running e.g. on an x58 mainboard. but even if you are running on an old board with pcie gen.1 and DDR1-RAM (i have an older workstation with FB-DDR1 and pcie), you’ll still get around 1.2 to 1.5 GB/s.
so your speedup for memory transfers will be something like 9x-42x.
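to make the arithmetic explicit, here is a minimal sketch in python. the rates are just the rough figures quoted above (not measurements of any particular system), and the names `transfer_speedup` and `transfer_time_ms` are made up for illustration. with these inputs the ratios come out at about 9x and about 41x, which is the same ballpark as the range above.

```python
# Rates quoted in the post, in decimal MB/s. Rough figures, not measurements.
PCI_MB_S = 133.0          # classic PCI, theoretical peak
PCIE_GEN1_MB_S = 1200.0   # lower end of the 1.2-1.5 GB/s seen on an old PCIe gen 1 board
PCIE_X58_MB_S = 5500.0    # about 5.5 GB/s reported on an X58 board


def transfer_speedup(new_mb_s, old_mb_s=PCI_MB_S):
    """Ratio of two sustained transfer rates."""
    return new_mb_s / old_mb_s


def transfer_time_ms(n_bytes, mb_s):
    """Milliseconds needed to copy n_bytes at a sustained rate of mb_s MB/s."""
    return n_bytes / (mb_s * 1e6) * 1e3


buffer_bytes = 64 * 1024 * 1024  # hypothetical 64 MiB host buffer
for name, rate in [("pci", PCI_MB_S),
                   ("pcie gen1", PCIE_GEN1_MB_S),
                   ("pcie x58", PCIE_X58_MB_S)]:
    print(f"{name:10s} {transfer_time_ms(buffer_bytes, rate):8.1f} ms "
          f"speedup vs pci: {transfer_speedup(rate):5.1f}x")
```

this only matters if your app actually spends a noticeable part of its time copying data between host and device.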
now let’s go on with the internal memory bandwidth:
an 8400gs only has a 64-bit bus and a 400 MHz (DDR) mem clock, so you’ll get 6.4 GB/s.
a gtx260 has a 448-bit bus and a 1000 MHz (DDR) mem clock, so you’ll get 112 GB/s.
that’s a speedup of 17.5x.
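the bandwidth numbers follow from bus width times memory clock. a small sketch, assuming double data rate memory (two transfers per clock) and decimal gigabytes (1 GB = 10^9 bytes); the function name `peak_bandwidth_gb_s` is hypothetical. if you divide the MB/s figure by 1024 instead, the absolute numbers come out slightly smaller, but the ratio stays the same.

```python
def peak_bandwidth_gb_s(bus_width_bits, mem_clock_mhz, transfers_per_clock=2):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes).

    transfers_per_clock=2 assumes DDR-type memory.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mem_clock_mhz * 1e6 * transfers_per_clock / 1e9


bw_8400gs = peak_bandwidth_gb_s(64, 400)    # figures from the post
bw_gtx260 = peak_bandwidth_gb_s(448, 1000)  # figures from the post

print(f"8400gs: {bw_8400gs:.1f} GB/s")
print(f"gtx260: {bw_gtx260:.1f} GB/s")
print(f"ratio:  {bw_gtx260 / bw_8400gs:.1f}x")
```

these are theoretical peaks; what a real kernel reaches depends on its access pattern.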
next, the number and clock of the stream processors:
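the 8400gs has 2 multiprocessors at 900 MHz (the programming guide counts multiprocessors; each one has 8 scalar cores on these chips).
the gtx260 has 24, the gtx260 core 216 edition has 27 multiprocessors. both are clocked at 1242 MHz.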
That’s a speedup of 16.5x-18.6x.
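the compute ratio is just (number of processors x clock) on each card. a quick sketch with the figures from this post; `compute_ratio` is a made-up helper name, and it assumes both cards do the same amount of work per processor per clock.

```python
def compute_ratio(new_units, new_clock_mhz, old_units, old_clock_mhz):
    """Ratio of (processor count * clock), assuming equal work per unit per clock."""
    return (new_units * new_clock_mhz) / (old_units * old_clock_mhz)


# Figures quoted in the post.
OLD_UNITS, OLD_CLOCK_MHZ = 2, 900      # 8400gs
GTX260_CLOCK_MHZ = 1242

ratio_gtx260 = compute_ratio(24, GTX260_CLOCK_MHZ, OLD_UNITS, OLD_CLOCK_MHZ)
ratio_core216 = compute_ratio(27, GTX260_CLOCK_MHZ, OLD_UNITS, OLD_CLOCK_MHZ)

print(f"gtx260:          {ratio_gtx260:.2f}x")
print(f"gtx260 core 216: {ratio_core216:.2f}x")
```

so the 12x from the processor count alone is too low, because the clock is higher as well.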
the last difference is the architecture itself. take a look at the compute capabilities and how they differ to see if this will get you an additional speedup.
so, switching from an 8400gs to a gtx260 will give you anything between 9x and 42x depending on your other system components and your kernel.
you have to analyse what your kernel does to know more about how it will scale.
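one rough way to combine the individual factors is an amdahl-style estimate: split the old runtime into the fraction spent on host-device transfers, on device memory traffic and on arithmetic, then divide each part by its own speedup. this is only a sketch under the assumption that the three phases do not overlap; the function name and the example fractions are hypothetical, so you would have to profile your own app to get real ones.

```python
def overall_speedup(fractions, speedups):
    """Amdahl-style estimate of the total speedup.

    fractions: share of the OLD runtime per phase (must sum to 1).
    speedups:  speedup factor of each phase on the new card.
    Assumes the phases run one after another with no overlap.
    """
    if abs(sum(fractions.values()) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    new_time = sum(fractions[k] / speedups[k] for k in fractions)
    return 1.0 / new_time


# Per-phase factors taken from the figures in this thread (low end for transfers).
speedups = {"transfer": 9.0, "memory": 17.5, "compute": 16.56}

# Hypothetical runtime profiles on the old card.
transfer_bound = {"transfer": 1.0, "memory": 0.0, "compute": 0.0}
mixed = {"transfer": 0.2, "memory": 0.5, "compute": 0.3}

print(f"transfer bound: {overall_speedup(transfer_bound, speedups):.1f}x")
print(f"mixed profile:  {overall_speedup(mixed, speedups):.1f}x")
```

the result always lies between the smallest and the largest per-phase factor, which is why the answer is a range and not a single number.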
Ok, 9x to 42x. Thanks a lot for your detailed listing :)