Multi-GPU programming: SLI or GPUDirect 2.0?

I want a desktop equipped with a GeForce GTX 460/560 for developing some scientific computing applications using CUDA.
Right now, I wish to buy only one card. But in the near future I might need to purchase another card to do multi-GPU CUDA
programming.

Hence I have 3 questions:

1) Does this mean I have to go for an SLI-equipped motherboard?
2) Is SLI technology used in multi-GPU CUDA programming?
3) If not, do I just have to purchase another GTX 460/560, fit it into an appropriate PCIe slot, and install the GPUDirect 2.0 drivers to start doing multi-GPU CUDA?

Being a CUDA/GPU noob, I would really appreciate some simplified/detailed explanations.

Thank you

1) No. SLI has nothing to do with CUDA.

2) See above.

3) There is no such thing as "GPUDirect 2.0 drivers"; peer-to-peer access is supported on Fermi cards in CUDA 4.0 without requiring anything extra. However, the GPUs used for peer-to-peer access must be the "same". That means you probably need two GTX 560s or two GTX 460s. The mixed case probably won't work, although I have not tried it with GF100/GF110 or GF104/GF114 combinations to be able to say conclusively whether that is the case.

Irrespective of the finer details of (3), GPUDirect isn't necessary for multi-GPU CUDA programming, and it doesn't automagically make your code run on multiple cards, if that is what you were thinking.
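
To illustrate that point, here is a minimal sketch of multi-GPU CUDA with no peer-to-peer or GPUDirect involvement at all: a single host thread selects each device in turn with cudaSetDevice and gives it its own allocation and kernel launch. The scale kernel and problem size are hypothetical placeholders, not anything from this thread:

```cpp
// Minimal sketch: drive every visible GPU from one host thread using plain
// CUDA runtime calls -- no SLI, no GPUDirect, no peer-to-peer needed.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void scale(float *data, int n, float factor)  // hypothetical kernel
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main()
{
    int deviceCount = 0;
    cudaGetDeviceCount(&deviceCount);

    const int n = 1 << 20;
    for (int dev = 0; dev < deviceCount; ++dev) {
        cudaSetDevice(dev);                  // all following calls target this GPU
        float *d_data = 0;
        cudaMalloc(&d_data, n * sizeof(float));
        scale<<<(n + 255) / 256, 256>>>(d_data, n, 2.0f);  // independent work per GPU
        cudaDeviceSynchronize();
        cudaFree(d_data);
    }
    printf("Ran on %d device(s)\n", deviceCount);
    return 0;
}
```

Each GPU just runs its own work; whether you then add peer-to-peer copies between them is a separate, optional step.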

As an additional note (I agree with everything avidday said), make sure the devices you want to use for peer-to-peer access are in the same PCIe domain. Otherwise this will not work.
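
A practical way to check this at runtime is to ask the CUDA runtime whether the two devices can actually reach each other before enabling peer access. The sketch below assumes devices 0 and 1 (CUDA 4.0+, Fermi-class GPUs) and is just one way to do the check; cudaDeviceCanAccessPeer will report 0 if the cards cannot do peer-to-peer, for example because they sit in different PCIe domains:

```cpp
// Minimal sketch: test and enable peer-to-peer access between devices 0 and 1.
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int canAccess01 = 0, canAccess10 = 0;
    cudaDeviceCanAccessPeer(&canAccess01, 0, 1);  // can device 0 access device 1?
    cudaDeviceCanAccessPeer(&canAccess10, 1, 0);  // can device 1 access device 0?

    if (canAccess01 && canAccess10) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);   // current device (0) may now access device 1
        cudaSetDevice(1);
        cudaDeviceEnablePeerAccess(0, 0);   // and device 1 may access device 0
        printf("Peer-to-peer access enabled between devices 0 and 1\n");
    } else {
        printf("Peer-to-peer not available; copy through host memory instead\n");
    }
    return 0;
}
```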