Hi,
I need to run FFTs on extremely long stretches of data (~3 GB).
I only recently realized that the Tesla D870's 3 GB of memory is really split between two C870s with 1.5 GB each, so I cannot run an FFT on a stretch of data larger than the capacity of a single card (1.5 GB).
I was wondering whether there is an implementation of CUFFT that uses multiple devices to compute a single transform. Is this even possible?
Thanks!
Y.