I am attempting to run several cufftXt transforms in a loop, transforming some variables forward and others back, using in-place C2C transforms distributed across multiple GPUs. After transforming the variables, I place them in natural order using cufftXtMemcpy with CUFFT_COPY_DEVICE_TO_DEVICE. The variables are then manipulated in k-space and transformed back into real space.
The real-space results are then stored in the previously transformed variables, and the imaginary part (.y) of each float2 element is set to 0. However, because of this, when I attempt to use the same variable for returning to natural order, I get an error. I assume this is because the variable I am transforming is still regarded as a k-space (“complex”) variable, even though I made it real by setting the imaginary (.y) part of the float2 to 0 rather than needlessly transforming it back.
The workflow is:
- forward in-place transform three different variables →
- return k-space output of all variables to natural order →
- manipulate in k-space by multiplying with a kernel (which is why the k-space variables are placed in natural order) →
- store the output of the manipulation in one of the three transformed variables →
- inverse in-place transform of the one variable →
- return real-space output to natural order →
- manipulations in real space are saved to the other two variables and the imaginary (.y) part is set to zero →
- forward in-place transform three different variables →
- return k-space output of all variables to natural order (this time I get the error CUFFT_INTERNAL_ERROR)
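For reference, the steps above can be sketched roughly as follows. This is a minimal sketch, not my actual code: the helper names `zeroImag`, `zeroImagOnAllGpus` and `transformThenReorder` are hypothetical, and it assumes the plan was already built for multiple GPUs (cufftCreate, cufftXtSetGPUs, cufftMakePlan*) and that both descriptors came from cufftXtMalloc with CUFFT_XT_FORMAT_INPLACE. It also assumes that `size[g]` is the number of bytes of data held on GPU `g`.

```cuda
#include <cassert>
#include <cuda_runtime.h>
#include <cufftXt.h>

// Hypothetical kernel: zero the imaginary (.y) part of n float2 samples.
__global__ void zeroImag(float2 *v, size_t n) {
    size_t i = blockIdx.x * static_cast<size_t>(blockDim.x) + threadIdx.x;
    size_t stride = static_cast<size_t>(gridDim.x) * blockDim.x;
    for (; i < n; i += stride) v[i].y = 0.0f;
}

// Hypothetical helper: run zeroImag on the slab held by each GPU.
// Assumes size[g] is the byte count of the data stored on GPU g.
cudaError_t zeroImagOnAllGpus(cudaLibXtDesc *d) {
    assert(d != nullptr && d->descriptor != nullptr);
    for (int g = 0; g < d->descriptor->nGPUs; ++g) {
        cudaError_t err = cudaSetDevice(d->descriptor->GPUs[g]);
        if (err != cudaSuccess) return err;
        size_t n = d->descriptor->size[g] / sizeof(float2);
        zeroImag<<<64, 256>>>(static_cast<float2 *>(d->descriptor->data[g]), n);
        err = cudaGetLastError();          // launch-time errors
        if (err != cudaSuccess) return err;
        err = cudaDeviceSynchronize();     // execution-time errors
        if (err != cudaSuccess) return err;
    }
    return cudaSuccess;
}

// Hypothetical helper: one in-place C2C transform (CUFFT_FORWARD or
// CUFFT_INVERSE) followed by a device-to-device copy into natural order.
cufftResult transformThenReorder(cufftHandle plan, cudaLibXtDesc *work,
                                 cudaLibXtDesc *natural, int direction) {
    cufftResult r = cufftXtExecDescriptorC2C(plan, work, work, direction);
    if (r != CUFFT_SUCCESS) return r;
    return cufftXtMemcpy(plan, natural, work, CUFFT_COPY_DEVICE_TO_DEVICE);
}
```

In this sketch, one pass of the loop would call `transformThenReorder(plan, v, vNatural, CUFFT_FORWARD)` for each of the three variables, later `transformThenReorder(plan, v, vNatural, CUFFT_INVERSE)` for the one variable, and `zeroImagOnAllGpus` on the two variables that receive real-space data. The second round of forward calls is where the error is returned.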
I attribute the error to not having transformed the two variables back using CUFFT_INVERSE. However, is there some field in the cudaLibXtDesc struct that can be changed so that I do not have to needlessly transform back into real space? Is there also some way to “reset” the struct?
Finally, what are the fields of the cudaLibXtDesc struct? I can find very little information.
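As far as I can tell from the headers shipped with cuFFT, the descriptor type is named cudaLibXtDesc and wraps a per-GPU cudaXtDesc. The sketch below prints the fields I could find, so the descriptor can be compared before and after each transform and copy. The helper names `subFormatName` and `printDescriptor` are hypothetical. The `subFormat` field appears to record whether the data is in natural or permuted order, but whether overwriting it by hand is supported is an open question here, not something this sketch asserts.

```cuda
#include <cassert>
#include <cstdio>
#include <cufftXt.h>

// Hypothetical helper: readable name for the subFormat field.
const char *subFormatName(int subFormat) {
    switch (subFormat) {
        case CUFFT_XT_FORMAT_INPUT:             return "CUFFT_XT_FORMAT_INPUT";
        case CUFFT_XT_FORMAT_OUTPUT:            return "CUFFT_XT_FORMAT_OUTPUT";
        case CUFFT_XT_FORMAT_INPLACE:           return "CUFFT_XT_FORMAT_INPLACE";
        case CUFFT_XT_FORMAT_INPLACE_SHUFFLED:  return "CUFFT_XT_FORMAT_INPLACE_SHUFFLED";
        case CUFFT_XT_FORMAT_1D_INPUT_SHUFFLED: return "CUFFT_XT_FORMAT_1D_INPUT_SHUFFLED";
        default:                                return "unknown";
    }
}

// Hypothetical helper: dump the fields of a multi-GPU descriptor.
// libDescriptor and cudaXtState are opaque pointers and are not printed.
void printDescriptor(const char *label, const cudaLibXtDesc *d) {
    assert(d != nullptr && d->descriptor != nullptr);
    std::printf("%s: version=%d library=%d subFormat=%s\n", label,
                d->version, static_cast<int>(d->library),
                subFormatName(d->subFormat));
    const cudaXtDesc *x = d->descriptor;
    for (int g = 0; g < x->nGPUs; ++g) {
        // One line per GPU: device id, device pointer, byte count.
        std::printf("  GPU %d: data=%p size=%zu bytes\n",
                    x->GPUs[g], x->data[g], x->size[g]);
    }
}
```

Calling `printDescriptor("after forward", v)` right after each cufftXtExecDescriptorC2C and cufftXtMemcpy call should show which field differs between the first pass, which works, and the second pass, which fails.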
Alternatively, if I knew how cuFFT permutes the data, all of these steps could be avoided by permuting the k-space kernel into the matching pattern.