I need to benchmark cuFFT performance. When I run a single 1024-point FFT, it takes around 70 us, but when I loop the FFT 100 or 1000 times, the average time per FFT is around 10-12 us.
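A minimal sketch of the kind of measurement described above, not the actual code behind these numbers. It assumes a single-precision complex-to-complex transform, an in-place plan with batch size 1, CUDA events for timing, and one untimed warm-up call; the variable names and the iteration count are illustrative assumptions.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cufft.h>

int main() {
    const int n = 1024;      // FFT length from the post
    const int iters = 1000;  // assumed loop count (the post mentions 100 or 1000)

    // Device buffer for one in-place complex-to-complex transform.
    cufftComplex* d_data = nullptr;
    if (cudaMalloc((void**)&d_data, sizeof(cufftComplex) * n) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(d_data, 0, sizeof(cufftComplex) * n);

    // One plan, created once and reused; batch = 1 means one FFT per call.
    cufftHandle plan;
    if (cufftPlan1d(&plan, n, CUFFT_C2C, 1) != CUFFT_SUCCESS) {
        fprintf(stderr, "cufftPlan1d failed\n");
        cudaFree(d_data);
        return 1;
    }

    // Untimed warm-up call (an assumption of this sketch) so that
    // one-time initialization is kept out of the timed loop.
    cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    cudaDeviceSynchronize();

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time the whole loop on the GPU timeline, then divide by the count.
    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i) {
        cufftExecC2C(plan, d_data, d_data, CUFFT_FORWARD);
    }
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // elapsed time in milliseconds
    printf("average time per FFT: %.3f us\n", ms * 1000.0f / iters);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cufftDestroy(plan);
    cudaFree(d_data);
    return 0;
}
```

This should build with something like `nvcc fft_bench.cu -lcufft` (the file name is hypothetical). Setting `iters` to 1 and removing the warm-up call would correspond to the single-FFT case.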
From other people's benchmark results, I learned that the time per FFT should be around 1 us or less.
Can you help me solve this problem if you know the cause? Thanks a lot.