According to the CUFFT reference manual, a 1D transform is limited to 8 million elements.
But when I tried to transform 30 million elements, I still got the correct result (checked against MATLAB and FFTW3). The only problem is a "CUFFT_INVALID_PLAN" error. In this case, the speedup is about 28 times.
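For reference, here is a minimal sketch of how the status of each CUFFT call could be checked for the single large transform, to see whether the error comes from plan creation or from execution. It assumes a complex-to-complex, in-place, single-precision transform, which may not match my actual data; the `describe` helper is a hypothetical name and is not part of CUFFT.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

// Hypothetical helper (not part of CUFFT): name the status codes of interest.
static const char *describe(cufftResult r) {
    switch (r) {
        case CUFFT_SUCCESS:      return "CUFFT_SUCCESS";
        case CUFFT_INVALID_PLAN: return "CUFFT_INVALID_PLAN";
        case CUFFT_ALLOC_FAILED: return "CUFFT_ALLOC_FAILED";
        case CUFFT_INVALID_SIZE: return "CUFFT_INVALID_SIZE";
        default:                 return "other CUFFT error";
    }
}

int main() {
    const int N = 30000000;  // 30 million elements; complex type is an assumption
    const size_t bytes = sizeof(cufftComplex) * N;

    cufftComplex *d_data = nullptr;
    if (cudaMalloc(&d_data, bytes) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, bytes);  // placeholder input, all zeros

    // Check plan creation separately from execution.
    cufftHandle plan;
    cufftResult planStatus = cufftPlan1d(&plan, N, CUFFT_C2C, 1);
    std::printf("cufftPlan1d:  %s\n", describe(planStatus));

    if (planStatus == CUFFT_SUCCESS) {
        cufftResult execStatus =
            cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
        std::printf("cufftExecC2C: %s\n", describe(execStatus));
        cudaDeviceSynchronize();
        cufftDestroy(plan);
    }

    cudaFree(d_data);
    return planStatus == CUFFT_SUCCESS ? 0 : 1;
}
```

If the plan call already reports a failure, the later "CUFFT_INVALID_PLAN" would just be a consequence of executing with a plan that was never created.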
When I divide the data into 6 batches, I can compute them without warnings, but the speedup is only 7 times.
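The batched setup looks roughly like the sketch below. It assumes 6 equal batches of 5 million complex elements each, in-place and single precision, which is only my guess at a representative layout; `totalElements` is a hypothetical helper. As far as I understand, the `batch` argument of `cufftPlan1d` runs that many independent transforms of length `nx`. The timing uses CUDA events around the exec call only.

```cuda
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

// Hypothetical helper: total number of elements covered by a batched plan.
static size_t totalElements(int nx, int batch) {
    return static_cast<size_t>(nx) * static_cast<size_t>(batch);
}

int main() {
    const int NX = 5000000;  // assumed length of each transform
    const int BATCH = 6;     // 6 batches, 30 million elements in total
    const size_t bytes = sizeof(cufftComplex) * totalElements(NX, BATCH);

    cufftComplex *d_data = nullptr;
    if (cudaMalloc(&d_data, bytes) != cudaSuccess) {
        std::printf("cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, bytes);  // placeholder input, all zeros

    // One plan describing BATCH independent transforms of length NX.
    cufftHandle plan;
    if (cufftPlan1d(&plan, NX, CUFFT_C2C, BATCH) != CUFFT_SUCCESS) {
        std::printf("cufftPlan1d failed\n");
        cudaFree(d_data);
        return 1;
    }

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time only the transform itself, not allocation or planning.
    cudaEventRecord(start);
    cufftResult status = cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    std::printf("exec status %d, elapsed %.3f ms\n",
                static_cast<int>(status), ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cufftDestroy(plan);
    cudaFree(d_data);
    return status == CUFFT_SUCCESS ? 0 : 1;
}
```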
So can I compute all 30 million elements in a single batch?