Quick question.
I’m considering buying the GTX590, and I was wondering whether a CUDA program will recognize it as a single device or as two devices.
I understand that the GTX590 is like two GTX580s combined on one board (with lower clocks), but to use the full power of all 1024 cores, would I need to code as if I were dealing with multiple GPUs, or as if I were running on a single GPU?
It’s just like two separate GPUs, unfortunately. That’s what the GTX590 physically is: two devices on one board, with a little extra logic to let them share the PCIe bus. This is still great hardware, but it does mean you need the extra coding to deal with multiple GPUs and do your work partitioning and synchronization with that in mind.
The good news is that this is exactly the same coding you need to do for multi-GPU support in general, so once you have that done, you can use any combination of GTX580s and GTX590s together or separate or whatever, and your code just sees them as more GPUs.
The confusion with the GTX590 (and the previous GTX295) is that for gamers this is all abstracted away, so the marketing can say “twice the graphics horsepower!” and it’s true for the gamers (and to a reasonable extent for the game programmers, who are usually abstracted behind a graphics API like DirectX). But CUDA is different: it hands you the individual GPUs to coordinate yourself.
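As a rough illustration of the multi-GPU coding described above, here is a minimal sketch that lists the devices CUDA reports (a GTX590 should show up as two entries) and splits an array evenly across them. It assumes CUDA 4.0 or later, where a single host thread can switch devices with cudaSetDevice; the kernel name `scale`, the array size, and the even split are made up for the example, and error checking is mostly left out for brevity.

```cuda
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: scales each element of a slice in place.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        std::printf("No CUDA devices found\n");
        return 1;
    }

    // Each physical GPU is reported as its own device.
    for (int d = 0; d < deviceCount; ++d) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, d);
        std::printf("Device %d: %s, compute capability %d.%d, %d multiprocessors\n",
                    d, prop.name, prop.major, prop.minor, prop.multiProcessorCount);
    }

    const int n = 1 << 20;                                  // arbitrary example size
    std::vector<float> host(n, 1.0f);
    std::vector<float*> devPtr(deviceCount, nullptr);
    const int chunk = (n + deviceCount - 1) / deviceCount;  // even split, rounded up

    // Pass 1: give every device its slice and launch its kernel.
    // Kernel launches are asynchronous, so the GPUs can compute concurrently.
    for (int d = 0; d < deviceCount; ++d) {
        int offset = d * chunk;
        int count = (offset < n) ? std::min(chunk, n - offset) : 0;
        if (count == 0) continue;
        cudaSetDevice(d);
        cudaMalloc(&devPtr[d], count * sizeof(float));
        cudaMemcpy(devPtr[d], host.data() + offset, count * sizeof(float),
                   cudaMemcpyHostToDevice);
        scale<<<(count + 255) / 256, 256>>>(devPtr[d], count, 2.0f);
    }

    // Pass 2: copy results back. Each blocking copy waits for that device's kernel.
    for (int d = 0; d < deviceCount; ++d) {
        int offset = d * chunk;
        int count = (offset < n) ? std::min(chunk, n - offset) : 0;
        if (count == 0) continue;
        cudaSetDevice(d);
        cudaMemcpy(host.data() + offset, devPtr[d], count * sizeof(float),
                   cudaMemcpyDeviceToHost);
        cudaFree(devPtr[d]);
    }

    std::printf("first element after scaling: %f\n", host[0]);
    return 0;
}
```

The same loop works unchanged whether the devices come from one GTX590, two GTX580s, or a mix, which is the point made above: the code just sees more GPUs.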
With my limited budget (just enough to buy one GTX590 or two GTX580s), I might be better off just buying two GTX580s, as I’ve heard they benchmark slightly higher because of the difference in processor clock, and then writing my code for multiple GPUs.
If your code is destined to run on different NVIDIA GPUs, and not only on your own computer, including multi-GPU configurations, unbalanced configurations (e.g. 9400M + 9600M GT), or even older pre-Fermi GPUs (compute capability 1.x), I would recommend getting a fast GTX 580 plus a pre-Fermi card (such as a GTX 280).
You will still be able to do multi-GPU work, but you will have to load-balance your kernels between the two GPUs.
You can also try optimizing for both Fermi and pre-Fermi GPUs.
I use a MacBook Pro with 9400M + 9600M GT specifically for multi-GPU programming and load-balancing work, so that my code gets as much as it can out of any configuration, and I now use a GTX 570 in my PC to optimize for Fermi.
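To make the load-balancing idea concrete, here is a small host-side sketch that splits a number of work items between unequal GPUs in proportion to a per-device weight. The `Slice` struct, the `splitWork` function, and the weights themselves are hypothetical; where the weights come from is left open (for instance, throughput measured in a short timing run on each device), so treat this as one possible approach rather than the method used above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical description of the part of the work given to one device.
struct Slice {
    std::size_t offset;  // index of the first item for this device
    std::size_t count;   // number of items for this device
};

// Split `total` items in proportion to `weights` (one weight per device).
// The last device absorbs the rounding remainder so every item is assigned.
// Returns an empty vector if there are no devices or no positive total weight.
std::vector<Slice> splitWork(std::size_t total, const std::vector<double>& weights)
{
    std::vector<Slice> slices;
    double sum = 0.0;
    for (double w : weights) sum += w;
    if (weights.empty() || sum <= 0.0) return slices;

    std::size_t offset = 0;
    for (std::size_t d = 0; d < weights.size(); ++d) {
        std::size_t count;
        if (d + 1 == weights.size()) {
            count = total - offset;  // remainder goes to the last device
        } else {
            count = static_cast<std::size_t>(total * (weights[d] / sum));
            if (count > total - offset) count = total - offset;
        }
        slices.push_back({offset, count});
        offset += count;
    }
    return slices;
}
```

For example, with 1000 items and weights {3.0, 1.0} (a fast card judged three times quicker than a slow one), the first device gets items 0 to 749 and the second gets the remaining 250. Each slice's offset and count can then be used for that device's copy and kernel launch after selecting it with cudaSetDevice.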