I finally got the MATLAB plug-in working on my computer, but the result is disappointing: CUDA gives almost no acceleration of the FFT computation, as shown below:
============================================
---- Run native Matlab simulations ----
which Szeta
C:\Matlab_CUDA-1.1a\Szeta.mexw64
tic; FS_2Dturb(128,1,1,1); toc;
CFL = 0.1017
Gsqav = 1.1995
Elapsed time is 4.856168 seconds.
tic; FS_vortex; toc;
ans = 512
Elapsed time is 24.564927 seconds.
---- Compile the CUDA source and rerun the simulations with acceleration ----
nvmex -f nvmexopts.bat Szeta.cu -IC:\cuda\include -LC:\cuda\lib64 -lcufft -lcudart
abdelali target arch: win64
Szeta.cu
tmpxft_00000dac_00000000-3_Szeta.cudafe1.gpu
tmpxft_00000dac_00000000-8_Szeta.cudafe2.gpu
tmpxft_00000dac_00000000-3_Szeta.cudafe1.cpp
which Szeta
C:\Matlab_CUDA-1.1a\Szeta.mexw64
CFL = 0.1017
Gsqav = 1.1995
Elapsed time is 4.834929 seconds.
tic; FS_vortex; toc;
ans = 512
Elapsed time is 23.179676 seconds.
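To put numbers on the comparison, here is a quick MATLAB sketch that works out the speedup from the elapsed times in the two runs above. The times are copied from the logs; the variable names (t_native, t_cuda, speedup, saving) are placeholders made up for this illustration and are not part of the plug-in.

```matlab
% Elapsed times copied from the logs above, in seconds.
% Order: FS_2Dturb(128,1,1,1), FS_vortex
names    = {'FS_2Dturb', 'FS_vortex'};
t_native = [4.856168, 24.564927];   % native MATLAB runs
t_cuda   = [4.834929, 23.179676];   % runs after compiling Szeta.cu

speedup = t_native ./ t_cuda;               % ratio > 1 means CUDA run is faster
saving  = 100 * (1 - t_cuda ./ t_native);   % percent of run time saved

for k = 1:numel(names)
    fprintf('%-10s speedup %.3fx, time saved %.1f%%\n', ...
            names{k}, speedup(k), saving(k));
end
```

This comes out to roughly 0.4% time saved for FS_2Dturb and 5.6% for FS_vortex.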
The improvement, if there is any, is imperceptible (<6%). The computing environment is as follows:
Windows 7 Pro 64-bit,
MATLAB 2009a,
VS2008 Pro,
CUDA 2.3 (driver: cudadriver_2.3_winvista_64_190.38_general.exe),
Dell Precision workstation (CPU: Intel Xeon 3.3 GHz),
Quadro FX 3800
Any suggestions on how to make CUDA work better?