CUFFT taking longer on some data than others


I have some FFT code that I use for image processing. I create the plan only once and reuse it throughout my code.

I have added some timing measurements, and even though all of the FFTs are called the same way, just with different data, the times vary significantly. Here is some output from my code:

Convolver FFT time: 0.47999999
Den IFFT time: 0.47000000
Den FFT time: 0.51999998
Upsize kernel FFT time: 1.79999995
Test_filt FFT time: 0.04000000
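
For reference, a minimal sketch of how per-call timings like these might be taken on the host. It is not the code that produced the numbers above: `time_seconds` is a hypothetical helper, and the use of `std::chrono` is an assumption. One caveat it tries to make visible: GPU launches generally return to the host before the device has finished, so for a timing to reflect the FFT itself, the timed callable would have to block until the GPU is done (for example by calling `cudaDeviceSynchronize()` after the cuFFT exec call). Without that, time can be attributed to whichever later call happens to wait.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in seconds, measured with a monotonic clock.
// For GPU work, `work` must not return until the device has finished
// (e.g. it ends with cudaDeviceSynchronize()); otherwise this mostly
// measures launch overhead rather than the FFT.
double time_seconds(const std::function<void()>& work) {
    assert(work);  // an empty std::function would throw when invoked
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Usage would look something like `double t = time_seconds([&] { /* shift, FFT, shift, then synchronize */ });`, with one such call per line of the output above.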

As you can see, the 4th FFT call takes roughly 3.5 times as long as the first three, even though all of the calls are supposed to do the same amount of work (an inverse shift, the FFT, and then a forward shift to put the DC component in the right place).
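
To make the "shift, FFT, shift" step concrete, here is a minimal host-side sketch of the two shifts on a row-major image. It assumes the usual fftshift/ifftshift convention (DC moves from index (0, 0) to (rows/2, cols/2) and back); the names `circ_shift`, `fft_shift`, and `ifft_shift` are hypothetical and are not taken from my actual code, which may do the shift differently (for example on the GPU). For even dimensions such as 1080x1920 the two shifts are the same operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Circularly shift a row-major rows x cols image by (dr, dc) elements.
std::vector<float> circ_shift(const std::vector<float>& img,
                              std::size_t rows, std::size_t cols,
                              std::size_t dr, std::size_t dc) {
    assert(img.size() == rows * cols);
    std::vector<float> out(img.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            // Element (r, c) moves to ((r + dr) mod rows, (c + dc) mod cols).
            out[((r + dr) % rows) * cols + (c + dc) % cols] = img[r * cols + c];
        }
    }
    return out;
}

// Forward shift: moves the DC term from (0, 0) to (rows/2, cols/2).
std::vector<float> fft_shift(const std::vector<float>& img,
                             std::size_t rows, std::size_t cols) {
    return circ_shift(img, rows, cols, rows / 2, cols / 2);
}

// Inverse shift: undoes fft_shift, including for odd dimensions.
std::vector<float> ifft_shift(const std::vector<float>& img,
                              std::size_t rows, std::size_t cols) {
    return circ_shift(img, rows, cols, rows - rows / 2, cols - cols / 2);
}
```

Each shift touches every pixel exactly once, so its cost is linear in the image size and should be the same for every call on same-sized data.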

What could be causing that to peak so badly for that one FFT call?

And while on this topic: aren't all of these times a little excessive? I timed the FFTs and the shifting separately, and the shifting was negligible, so these times are essentially for the FFT of a 1080x1920 image. I would have expected the FFT itself to take a few hundredths of a second.