The xcorr script makes a lot of calls to fft, ifft, conj, etc.
How can I configure or compile Octave to use CUDA specifically for this xcorr function? Could cuFFT be used somehow? Which GPU-accelerated library is best suited to this function?
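For reference, here is a minimal sketch of the usual FFT-based cross-correlation pattern, which is where the fft, ifft and conj calls come from. It is plain Python rather than Octave, and it assumes that xcorr follows the standard recipe (zero-pad, transform both inputs, multiply one spectrum by the conjugate of the other, transform back). The names `xcorr_fft` and `xcorr_direct` and the small radix-2 `fft` helper are made up for illustration and are not taken from the Octave sources.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (illustrative helper); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(X):
    """Inverse FFT: unnormalised inverse transform divided by the length."""
    n = len(X)
    return [v / n for v in fft(X, inverse=True)]


def xcorr_fft(a, b):
    """r[lag] = sum_n a[n + lag] * conj(b[n]) for lag = -(len(b)-1) .. len(a)-1."""
    n_out = len(a) + len(b) - 1
    n = 1
    while n < n_out:  # pad to a power of two so the circular result does not wrap
        n *= 2
    A = fft(list(a) + [0] * (n - len(a)))
    B = fft(list(b) + [0] * (n - len(b)))
    # The three operations that dominate the run time: fft, conj multiply, ifft.
    r = ifft([x * y.conjugate() for x, y in zip(A, B)])
    # Negative lags sit at the end of the circular result; move them to the front.
    return r[n - (len(b) - 1):] + r[:len(a)]


def xcorr_direct(a, b):
    """Direct O(N*M) sum, only used to check the FFT version."""
    out = []
    for lag in range(-(len(b) - 1), len(a)):
        s = 0j
        for n in range(len(b)):
            if 0 <= n + lag < len(a):
                s += a[n + lag] * complex(b[n]).conjugate()
        out.append(s)
    return out


if __name__ == "__main__":
    print(xcorr_fft([1, 2, 3], [0, 1, 0.5]))
```

If xcorr does follow this pattern, the two forward transforms and the inverse transform are the parts a GPU FFT library would have to take over; the conjugate multiply is elementwise and cheap by comparison.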
I am using Octave to calculate large two-dimensional fields, which involves a lot of FFTs and IFFTs. To speed up the calculation I tried the CUDA libraries. cuBLAS (from CUDA V8.0) works fine but has no optimisation for the FFT routines. The video under “Drop-in Acceleration on GPUs with Libraries” - “Learning Libraries” shows how to use the CUDA FFT instead of FFTW. I managed to install the Octave sources (V. 4.0.3) and to compile a new binary that works well. Unfortunately I am not familiar enough with coding under Linux to replace FFTW with cuFFT. Has anyone done this who can provide the changed sources/files or a “howto”? That would be very helpful!
Thanks in advance!