cufft on Tesla D870 using 3GB? I wonder... ;)

Hi,

I need to run FFTs on extremely long stretches of data (~3GB).
I just recently realized that the Tesla D870’s 3GB of memory is actually split between two C870s with 1.5GB each, so technically I cannot run an FFT on a stretch of data bigger than the capacity of an individual card (1.5GB).
I was wondering whether there is some implementation of cufft that uses multiple devices for a single transform? Is this even possible?

Thanks!

Y.

You might want to consider upgrading your hardware to the C1060 or S1070, where each GPU has 4GB dedicated to it.
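
On the "is this even possible?" part: mathematically, yes. A length-N 1D FFT can be split by hand into two independent length-N/2 FFTs (one radix-2 decimation-in-time step), and the two results combined with twiddle factors afterwards. Below is a minimal CPU-only sketch of that idea in plain Python, not CUDA code and not anything cufft provides. The names `fft`, `split_fft` and `naive_dft` are made up for illustration; `fft` merely stands in for whatever per-device FFT call you would actually use, and N is assumed to be a power of two.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT (len(x) must be a power of two).

    Hypothetical stand-in for a transform executed on a single device.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def split_fft(x):
    """One decimation-in-time step: two independent half-size FFTs.

    The two fft() calls share no data, so in principle each could be
    sent to a different GPU; only the final butterfly needs both results.
    """
    n = len(x)
    half = n // 2
    even = fft(x[0::2])  # even-indexed samples: e.g. device 0
    odd = fft(x[1::2])   # odd-indexed samples:  e.g. device 1
    out = [0j] * n
    for k in range(half):
        # Twiddle factor W_n^k = exp(-2*pi*i*k/n)
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def naive_dft(x):
    """O(n^2) reference DFT, used only to check the result."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

Comparing `split_fft(x)` against `naive_dft(x)` on a small power-of-two input is a quick sanity check. Keep in mind that the combine step runs on the host and needs the full output in host memory, the even/odd split means strided copies to each card, and each half (plus any workspace the FFT needs) still has to fit on a single card, so for ~3GB of data on 1.5GB cards you would probably have to split further.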