I didn't look at the example, but yes, you are right, they made a mistake there. They even passed the plan by value instead of by reference and omitted the second parameter, which gives the rank.
Now that I have solved that part and cufftPlanMany is working, I cannot get cufftExecZ2Z to run successfully unless the batch count is 1. This is far from the batch count of 27000 that I need. I get a ULF (unspecified launch failure) if BATCH > 1.
Is there a trick to get it to run?
Edit: The rank could be wrong here… I will check that first.
Correct me if I'm wrong, but for a batch size of 27000 and a size of 128x256 per transform you would need 128 * 256 * 27000 * 4 bytes. That comes to a bit more than 3 GB stored in one array. Do you have a C2070 for this?
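To make the arithmetic explicit, here is a small sketch in plain C. The helper name `batch_bytes` is made up for illustration. With 4 bytes per element the total is about 3.5 GB. Note that cufftExecZ2Z works on cufftDoubleComplex values, which are 16 bytes each, so if the 27000 transforms really are 128x256 double-complex, the same formula gives roughly 14.2 GB.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold `batch` transforms of nx * ny elements each.
 * elem_size is the size of one element in bytes, for example
 * 4 for a float or 16 for a double-precision complex value. */
uint64_t batch_bytes(uint64_t nx, uint64_t ny, uint64_t batch, uint64_t elem_size)
{
    assert(elem_size > 0);
    /* 64-bit arithmetic: these totals do not fit in a 32-bit int. */
    return nx * ny * batch * elem_size;
}
```

For example, `batch_bytes(128, 256, 27000, 4)` is 3538944000 and `batch_bytes(128, 256, 27000, 16)` is 14155776000.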
Ok,
I got this part working but I found another problem.
The function cufftExecZ2Z does not give the same answer as the equivalent FFTW3 function.
For exactly the same input array, the first few output elements are shifted by 2 positions, and after around 50 elements the signs seem to be reversed, at least for the real part.
This is for a Plan3d (30,30,30) transform.
Was this function ever checked against FFTW3?
I can show some plots if required.
Edit 1: Because the plan and the exec are done in the same call and an in-place transform is required, I had not realized that the transform was being applied twice each time I called the function. All seems fine now.
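For this kind of cuFFT versus FFTW3 check, a single number is often easier to read than plots. The sketch below is plain C and assumes both results have already been copied into host memory as interleaved (real, imaginary) doubles, which is how both fftw_complex and cufftDoubleComplex lay out their values. The helper name `max_abs_diff` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Largest absolute per-component difference between two arrays of
 * n complex values, each stored as interleaved (re, im) doubles. */
double max_abs_diff(const double *a, const double *b, size_t n)
{
    assert(a != NULL && b != NULL);
    double worst = 0.0;
    for (size_t i = 0; i < 2 * n; ++i) {
        double d = a[i] - b[i];
        if (d < 0.0)
            d = -d;          /* absolute value without needing libm */
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

A result near the rounding level of the data suggests the two libraries agree. A large value on identical input is a hint to look at the call sequence first, such as an in-place transform being executed twice.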
I would like to replace the NULL, 1, 0, NULL, 1, 0 arguments (inembed, istride, idist, onembed, ostride, odist) with their FFTW3 equivalents. I will see whether I can make all the data contiguous in the meantime.
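One way to make the data contiguous is to repack it on the host before the transform. The sketch below is plain C and only covers rank-1 transforms. It assumes the FFTW3 advanced-interface convention, where element j of transform b sits at index b*idist + j*istride, and it assumes interleaved (real, imaginary) doubles. The helper name `pack_strided` is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Copy `batch` strided 1D transforms of length n into a contiguous
 * buffer, so that transform b occupies out[b*n .. b*n + n - 1].
 * Indices count complex elements; each one is two doubles (re, im). */
void pack_strided(const double *in, double *out, size_t n,
                  size_t istride, size_t idist, size_t batch)
{
    assert(in != NULL && out != NULL && in != out);
    for (size_t b = 0; b < batch; ++b) {
        for (size_t j = 0; j < n; ++j) {
            size_t src = b * idist + j * istride;  /* strided source index */
            size_t dst = b * n + j;                /* contiguous target index */
            out[2 * dst]     = in[2 * src];        /* real part */
            out[2 * dst + 1] = in[2 * src + 1];    /* imaginary part */
        }
    }
}
```

The packed buffer then matches the stride 1, distance n layout that the default arguments describe, at the cost of one extra copy.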
EDIT: I would like to confirm something.
If I have a 2x2x2 array defined in Fortran and I linearize it to 1D, then it should not matter to cufftPlan whether the input array was defined in C or Fortran, right?
But given that cufftPlanMany does not implement strides, if I rearrange the 1D input array to represent the 'strided' array, should I take into account that the array is defined in Fortran and reorder the sequence before passing it to cufftPlanMany?
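On the ordering question, a short sketch in plain C may help. The helper names are made up for illustration. Fortran stores the first index fastest and C stores the last index fastest, so a Fortran array dimensioned (n1,n2,n3) occupies memory exactly like a C array declared [n3][n2][n1]. As far as I understand, cuFFT reads its dimensions in the C order, so the bytes need no reshuffling, but the dimension list should be reversed. For a cube such as 2x2x2 or 30x30x30 the reversed list is identical, which is why the difference does not show up there.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of Fortran element A(i,j,k), written 0-based, in an array
 * dimensioned (n1,n2,n3): column-major, first index varies fastest. */
size_t fortran_offset(size_t i, size_t j, size_t k,
                      size_t n1, size_t n2, size_t n3)
{
    assert(i < n1 && j < n2 && k < n3);
    return i + n1 * (j + n2 * k);
}

/* Offset of C element a[p][q][r] in an array declared a[d0][d1][d2]:
 * row-major, last index varies fastest. */
size_t c_offset(size_t p, size_t q, size_t r,
                size_t d0, size_t d1, size_t d2)
{
    assert(p < d0 && q < d1 && r < d2);
    return (p * d1 + q) * d2 + r;
}
```

For every valid index, `fortran_offset(i, j, k, n1, n2, n3)` equals `c_offset(k, j, i, n3, n2, n1)`. Any reordering done to emulate strides has to use the same convention: in the Fortran array the unit-stride direction is the first index, not the last.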