I use the cuFFT library with multiple GPUs (the Xt API) to accelerate my programme, and I've run into an issue for which I cannot find a good solution. The programme simulates wave propagation in a loop that uses R2C and C2R transformations.
First I create R2C and C2R cufftXt plans. Then I create 3 cudaLibXtDesc descriptors using cufftXtMalloc:
- X with format INPLACE (will be initialized with data),
- Y with format INPLACE_SHUFFLED and
- Z with format INPLACE_SHUFFLED
After copying the initial data into X using cufftXtMemcpy, I begin the main loop.
First an R2C transformation is performed over X. Then a kernel is launched that reads the complex data in X and computes new values for X, Y and Z from it. Finally I perform a C2R transformation over X, Y and Z and use the data from all three descriptors.
The problem is that after each iteration X ends up in INPLACE format (which is fine), but Y and Z are in INPLACE format as well, and I need them in INPLACE_SHUFFLED format for the next iteration. I don't need to perform an R2C transformation over them just to put them in that format. I would like to simply change their current state. I know all their data would then be invalid, but I overwrite it anyway.
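To make the format bookkeeping concrete, here is a minimal host-only C++ sketch of how I understand the tags to change during the loop. It does not call cuFFT at all: `Format`, `Desc`, `r2c`, `c2r`, `iterate` and `demo` are made-up stand-ins, and the rule that each transform checks the format tag of its input is my assumption about the library's behaviour.

```cpp
#include <cassert>

// Stand-in for the format tag a descriptor carries (not the real cuFFT enum).
enum class Format { Inplace, InplaceShuffled };

// Hypothetical model of a descriptor: only the tag, no device memory.
struct Desc { Format fmt; };

// Modelled R2C: expects natural order, leaves the data tagged as shuffled.
inline bool r2c(Desc& d) {
    if (d.fmt != Format::Inplace) return false;
    d.fmt = Format::InplaceShuffled;
    return true;
}

// Modelled C2R: expects shuffled order, leaves the data tagged as natural.
inline bool c2r(Desc& d) {
    if (d.fmt != Format::InplaceShuffled) return false;
    d.fmt = Format::Inplace;
    return true;
}

// One loop iteration: R2C on X, kernel, then C2R on X, Y and Z.
// Returns true only if every transform saw the tag it expects.
inline bool iterate(Desc& x, Desc& y, Desc& z) {
    const bool fwd = r2c(x);
    // The kernel would write X, Y and Z here; it does not change any tag.
    const bool bx = c2r(x);
    const bool by = c2r(y);
    const bool bz = c2r(z);
    return fwd && bx && by && bz;
}

// Two iterations: the first succeeds, the second trips over Y and Z.
inline void demo() {
    Desc x{Format::Inplace};
    Desc y{Format::InplaceShuffled};
    Desc z{Format::InplaceShuffled};
    const bool first = iterate(x, y, z);
    const bool second = iterate(x, y, z);
    assert(first);    // all tags matched in iteration 1
    assert(!second);  // Y and Z are still tagged Inplace in iteration 2
}
```

In this model X returns to INPLACE by itself because it goes through both transforms, while Y and Z only go through C2R and therefore never get back to INPLACE_SHUFFLED.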
The solution that came to mind is to allocate new descriptors for Y and Z in INPLACE_SHUFFLED format every iteration, but that is not good enough for me.
So I would like to ask whether there is another way, or whether one could be added, such as a cufftXtSetDescriptorState function.
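Something like the following is what I have in mind. This is only a sketch with stand-in types so that it compiles without CUDA: `SubFormat`, `XtDescModel` and `setDescriptorState` are hypothetical names and none of them exist in cuFFT. As far as I can tell from cudalibxt.h, the real cudaLibXtDesc does carry a subFormat field, but I do not know whether overwriting it by hand is supported or safe.

```cpp
#include <cassert>

// Stand-in for the library's sub-format enum (hypothetical values).
enum class SubFormat { Inplace, InplaceShuffled };

// Stand-in for a multi-GPU descriptor: just the format tag.
struct XtDescModel { SubFormat subFormat; };

// Hypothetical helper: relabel the descriptor without touching its data.
// The contents become meaningless in the new format, which is acceptable
// when the caller overwrites them before the next transform.
inline bool setDescriptorState(XtDescModel* desc, SubFormat wanted) {
    if (desc == nullptr) return false;  // nothing to relabel
    desc->subFormat = wanted;
    return true;
}

// Usage at the end of an iteration: Y has just come out of C2R.
inline void relabelExample() {
    XtDescModel y{SubFormat::Inplace};
    const bool ok = setDescriptorState(&y, SubFormat::InplaceShuffled);
    assert(ok);
    assert(y.subFormat == SubFormat::InplaceShuffled);
}
```

I would call this on Y and Z at the end of each iteration, so no reallocation and no extra R2C would be needed.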
Thank you very much.