questions about CUFFT usage belong on this forum. This possibly related topic indicates that the CUFFT team is/was aware of issues and changes in CUFFT plan creation. It seems evident from the description there that CUFFT plan creation (now) may also trigger module loading. This is consistent with a general trend in CUDA towards lazy loading, which has a variety of supporting reasons, but is not without some associated issues.
Certainly plan reuse is a good option. Also, as described there, cufftDestroy can create a situation where module reloading takes place at the next plan creation. Therefore, as suggested there, another option to consider is storing all your plans in a vector and not destroying them until performance is no longer a concern. Obviously that workaround has limits as well; you would not want to store a vector of trillions of plans.
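As a rough sketch of that workaround (untested here; the cache structure, helper name, and single-threaded 1D C2C assumptions are mine, not anything prescribed by the CUFFT library), you could key kept-alive plans by transform size:

```cpp
#include <cufft.h>
#include <map>

// Hypothetical cache: one 1D C2C plan per transform size.
// Plans are created on first use and deliberately never destroyed
// while performance matters, avoiding the module reload that can
// follow cufftDestroy. Error checking omitted for brevity.
static std::map<int, cufftHandle> plan_cache;

cufftHandle get_plan(int nx) {
    auto it = plan_cache.find(nx);
    if (it != plan_cache.end()) return it->second;  // reuse existing plan
    cufftHandle plan;
    cufftPlan1d(&plan, nx, CUFFT_C2C, 1);           // created once, kept alive
    plan_cache[nx] = plan;
    return plan;
}

// Usage (d_in/d_out are device buffers of the matching size):
// cufftExecC2C(get_plan(4096), d_in, d_out, CUFFT_FORWARD);
```

A map keeps the number of live plans bounded by the number of distinct sizes you actually use, rather than the number of transforms you run.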
The plan expects a certain size. You can reuse a plan on a smaller data set if the data are padded to the size the plan expects. Padding of FFT data is a common scenario (in my view), but it may not fit your needs: it requires you to pad the data, and it also affects the output numerically.
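A minimal sketch of that padding approach, assuming a plan built for a fixed size and a caller-provided device workspace (the constant, function name, and buffer layout are my own illustration, and error checking is omitted):

```cpp
#include <cufft.h>
#include <cuda_runtime.h>

// Hypothetical: reuse one plan built for PADDED_N on a smaller
// signal of length n by zero-padding up to the plan's size.
const int PADDED_N = 4096;  // the size this plan expects (my choice)

void padded_fft(cufftHandle plan, const cufftComplex* d_signal, int n,
                cufftComplex* d_work /* device buffer, length PADDED_N */) {
    // Zero the workspace, then copy the n valid samples into the front.
    cudaMemset(d_work, 0, PADDED_N * sizeof(cufftComplex));
    cudaMemcpy(d_work, d_signal, n * sizeof(cufftComplex),
               cudaMemcpyDeviceToDevice);
    // Note the numerical effect: this produces a PADDED_N-point
    // spectrum, which is not the same as an n-point transform of
    // the unpadded data.
    cufftExecC2C(plan, d_work, d_work, CUFFT_FORWARD);
}
```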
Unless there is some objection, I'll plan to move this topic over to the other forum I referenced shortly.