I am currently converting a previously working FFTW stack running on a TX2 to cuFFT.
I have reached the point where everything compiles and links without error, but the program segfaults.
When using the wrapper, do we still need to use cudaMalloc to allocate memory? Or is it intelligent enough to work with ordinary host memory, since the Tegra shares RAM between the CPU and GPU?
My current procedure in C++:
    float *input = new float[INPUT_SIZE]();
    Complex *output = new Complex(INPUT_SIZE/2 +1);
    fft_planRange = fftwf_plan_dft_r2c_1d(INPUT_SIZE/2 +1, input, reinterpret_cast<fftwf_complex*>(&output), FFTW_PATIENT);
    // ... fill buffers
    fftwf_execute_dft_r2c(fft_planRange, input, reinterpret_cast<fftwf_complex*>(&output));
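To make the buffer layout I am aiming for concrete, here is a minimal sketch that does not depend on FFTW or cuFFT at all. As far as I understand the FFTW r2c interface, the size argument is the number of real input samples n, the output holds n/2 + 1 complex bins, and the pointer handed over is the buffer itself rather than the address of the pointer variable. The names `r2c_output_size` and `naive_r2c` are hypothetical helpers made up for this illustration, and the transform is a naive O(n^2) DFT used only as a stand-in, not the library call.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Hypothetical helper: number of complex bins produced by a
// real-to-complex transform of n real samples.
constexpr std::size_t r2c_output_size(std::size_t n) { return n / 2 + 1; }

// Hypothetical stand-in for an r2c transform (naive DFT, NOT FFTW/cuFFT).
// `in` must point to n floats, `out` to r2c_output_size(n) Complex values.
void naive_r2c(std::size_t n, const float* in, Complex* out) {
    assert(in != nullptr && out != nullptr);
    const double two_pi = 6.283185307179586;
    for (std::size_t k = 0; k < r2c_output_size(n); ++k) {
        double re = 0.0;
        double im = 0.0;
        for (std::size_t j = 0; j < n; ++j) {
            const double angle = -two_pi * static_cast<double>(k) *
                                 static_cast<double>(j) / static_cast<double>(n);
            re += in[j] * std::cos(angle);
            im += in[j] * std::sin(angle);
        }
        out[k] = Complex(static_cast<float>(re), static_cast<float>(im));
    }
}
```

A caller would allocate `std::vector<float> input(N)` and `std::vector<Complex> output(r2c_output_size(N))`, then pass `input.data()` and `output.data()` directly. Whether the same host buffers can be handed to the wrapper on the TX2 is exactly what I am unsure about.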