CUFFT bug: problem with inversion of the 3D Fourier transform

ramarromarrone · October 11, 2010, 12:46pm

Good day,

I’m implementing in ANSI C some algorithms to reconstruct volumes from 2D CT projections in parallel-beam and cone-beam geometry.
In the implementation of the projection-slice theorem (or Fourier slice theorem), to invert the 3D Fourier transform I use the CUFFT library to run the algorithm on the GPU and FFTW to run it on the CPU,
and then I visualize the results with MATLAB.

Working with volumes of 300x300x300 (27,000,000 voxels) and 350x350x350 (42,875,000 voxels), there are no problems.

With bigger volumes, 380x380x380 (54,872,000 voxels), you can see from the image posted below (Confronto_Ric_Vol_380x380x380_900pr.jpeg) that the reconstruction with CUFFT (3rd row)
shows some artifacts, whereas the reconstruction with FFTW is OK (2nd row).

When I try with volumes of 400x400x400 (64,000,000 voxels), the reconstruction with CUFFT returns a volume that is entirely zero except in the middle, as you can see in the second image posted (Ric_vol_FourierSliceCUFFT_400.jpeg).

I suspect this issue is caused by the fact that my GPU works only with single-precision floating point, whereas FFTW on the CPU works with double-precision floating point, but I would like a more precise answer from NVIDIA, if possible.

My system is:
Ubuntu 10.04 64-bit, CUDA 3.1.1, dev driver 260.24, Athlon X2 7850, 4 GB RAM, GeForce GTS 250 with 1 GB VRAM.

Thanks a lot,
Andrea Dossi

Cliff_Woolley · October 11, 2010, 7:34pm

It sounds like you might be running out of GPU memory. Are you checking the return codes from cufftExec() to be sure it returns success?

Thanks,

Cliff
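As a rough check on the out-of-memory hypothesis, the sketch below computes how many bytes one volume occupies. It assumes the data is stored as single-precision complex values (8 bytes per voxel), which the thread does not confirm; `complex_volume_bytes` is a hypothetical helper name, not part of CUFFT.

```c
#include <assert.h>

/* Bytes for an n x n x n volume stored as single-precision complex values
   (two 4-byte floats, i.e. 8 bytes per voxel). The layout is an assumption;
   the thread does not say which transform type is used. */
static unsigned long long complex_volume_bytes(unsigned long long n)
{
    assert(n > 0);
    return n * n * n * 2ULL * sizeof(float);
}
```

Under that assumption a 400x400x400 volume takes 512,000,000 bytes (about 488 MiB) and a 380x380x380 volume takes 438,976,000 bytes (about 419 MiB). If the transform also needs a second buffer of similar size for output or scratch space (again an assumption), the 400-cube case would need roughly 977 MiB, leaving little or no headroom on a 1 GB card.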

ramarromarrone · October 12, 2010, 9:01am

When I check the return value of cufftExec while trying to reconstruct a 400x400x400-voxel volume, it is CUFFT_EXEC_FAILED.

OK, but this answers only my 2nd question, because cufftExec with a 380x380x380 volume returns CUFFT_SUCCESS but

there are a lot of artifacts in the reconstruction using CUFFT that are not present in the reconstruction using FFTW, as you can see in the 1st image posted and in this new image (much better).

In these reconstructions no filters are applied; the only difference between the algorithms is the library used to calculate the inverse transform.

Thanks a lot for your patience.
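Since the two reconstructions differ only in the FFT library, the mismatch can also be quantified rather than judged from images. The following is a minimal sketch, assuming the GPU result has been copied back as an array of `float` and the CPU reference is an array of `double` of the same length; `max_relative_diff` is a hypothetical helper, not a CUFFT or FFTW function.

```c
#include <assert.h>
#include <stddef.h>

/* Largest absolute difference between a single-precision result (e.g. the
   GPU path) and a double-precision reference (e.g. the CPU path), divided
   by the largest absolute value found in the reference. */
static double max_relative_diff(const float *result, const double *reference, size_t n)
{
    double max_diff = 0.0;
    double max_ref = 0.0;
    size_t i;

    assert(result != NULL && reference != NULL && n > 0);
    for (i = 0; i < n; ++i) {
        double d = (double)result[i] - reference[i];
        double r = reference[i];
        if (d < 0.0) d = -d;   /* |result - reference| */
        if (r < 0.0) r = -r;   /* |reference| */
        if (d > max_diff) max_diff = d;
        if (r > max_ref) max_ref = r;
    }
    /* An all-zero reference has no scale to normalise by; return the raw difference. */
    return (max_ref > 0.0) ? max_diff / max_ref : max_diff;
}
```

Ordinary single-precision rounding would normally give a small value here, while visible artifacts like the ones described should give a much larger one, so comparing this number across volume sizes can help separate rounding noise from a real failure.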

ramarromarrone · October 13, 2010, 12:11am

up?

ramarromarrone · October 13, 2010, 10:49pm

Could nobody from NVIDIA tell me what causes (or could cause) these artifacts with volumes bigger than 350x350x350?

cufftExec returns “CUFFT_SUCCESS”, so I expect the result to be correct, but it isn’t.

This issue is probably related to the fact that my graphics card supports only single-precision floating point, but I would like a confirmation from NVIDIA
because it’s very important for my thesis, and I think it should be reported in the library documentation.

Thanks,
Andrea Dossi

eelsen · October 14, 2010, 12:32am

Just a guess…but the prime factors of

300 are 2, 2, 3, 5, 5

350 are 2, 5, 5, 7

380 are 2, 2, 5, 19

Depending on the implementation, that fairly large prime factor (19) could be causing accuracy issues with that transform. Do you see similar problems with a 2D or 1D transform of that dimension? 378 and especially 384 have a much nicer set of prime factors.
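The factor check above can be sketched in a few lines of C. Both helpers are hypothetical names introduced for illustration: one returns the largest prime factor of a size, the other finds the smallest size at or above a given value whose prime factors are all in {2, 3, 5, 7}.

```c
#include <assert.h>

/* Largest prime factor of n (n >= 2), found by trial division. */
static int largest_prime_factor(int n)
{
    int largest = 1;
    int p;

    assert(n >= 2);
    for (p = 2; p * p <= n; ++p) {
        while (n % p == 0) {
            largest = p;
            n /= p;
        }
    }
    if (n > 1)
        largest = n;  /* whatever remains is a single prime factor */
    return largest;
}

/* Smallest size >= n whose prime factors are all in {2, 3, 5, 7}. */
static int next_smooth_size(int n)
{
    assert(n >= 1);
    while (n > 1 && largest_prime_factor(n) > 7)
        ++n;
    return n;
}
```

For the sizes in this thread, 300 and 350 have largest prime factors 5 and 7, 380 has 19, and the next size above 380 with only small factors is 384.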

Cliff_Woolley · October 14, 2010, 12:35am

This is likely a very good guess at what’s going on.

For this reason, actually, I was just about to ask if you’ve (ramarromarrone) tried with CUFFT 3.2RC? The accuracy for sizes whose factors are not limited to radix 2, 3, 5, or 7 should be greatly improved in 3.2.

Thanks,

Cliff

ramarromarrone · October 14, 2010, 10:17am

I followed your advice and tried volumes of 378x378x378 and 384x384x384.
Now it works very well (with CUDA 3.1!!! **); in image CUDA_3.1.jpeg you can see that the artifact appears only in the 380x380x380 volume.
Image Ric_vol_FS_CUFFT_384_900pr.jpeg shows that the entire volume is well reconstructed.

My problem is resolved, so thank you very much for your support!

** PS:
I also tried CUDA Toolkit 3.2 RC and it works very badly, as you can see in image CUDA_RC3.2.jpeg.
I know it’s a release candidate, but the results are terrible! I had to downgrade to CUDA 3.1.

Cliff_Woolley · October 18, 2010, 8:37pm

I’d definitely like to be sure that the issue you saw in 3.2 RC1 is fixed up. Are you getting errors returned back from cufftExec with 3.2 RC1?

ramarromarrone · October 20, 2010, 1:23pm

Just yesterday I retried CUDA Toolkit 3.2 RC.

cufftExec returned CUFFT_SUCCESS, but the issues were the same on both the PC and the notebook.

PC configuration:

Ubuntu 10.10 64-bit, gcc 4.5, dev driver 260.24, Eclipse with CUDA plugin, Athlon X2 7850, 4 GB RAM, GeForce GTS 250 with 1 GB VRAM

Notebook configuration:

Ubuntu 10.10 64-bit, gcc 4.5, dev driver 260.24, Eclipse with CUDA plugin, Core i3 330M, 4 GB RAM, GeForce GT 320M with 1 GB VRAM

For me it’s not a problem, because I can work, and actually work very well, on both the PC and the notebook with CUDA 3.1.1. But if you need more information about these problems with 3.2 RC, I can send you more images, code, etc.
