Hi, I am getting incorrect results for some problem sizes with CUFFT. The application is the simple example provided by NVIDIA. I use the in-place version of the transform.
In the attached logs, every tested combination is listed with its diffs (no diff is good, i.e. 0.0000). Unfortunately, the result is sometimes wrong (there are some diffs).
This test over all the combinations has been run several times. When re-running the big test (all the combinations again), the errors don’t appear at the same place! :blink:
For example, the first result is Ok, the second one is not:
NX=1024 NY=2048
0
0
tmp = nan, max = 0.000000
NX=1024 NY=4096
0
0
0 0 4194304.000000 4177920.000000 16384.000000
1024 31 0.000000 -19758.222656 19758.222656
1024 55 0.000000 -22357.396484 22357.396484
tmp = nan, max = 22357.396484
Ok let’s check the context :ph34r:
For several combinations, the output data differs between executions. I attach 2 logs, corresponding to two runs over all the problem-size combinations, with:
- the same machine
- same card (GTX295 / dual GPU)
- same driver (185.12)
- same SDK (cuda 2.1)
- same application (a.out)
- same input data (the matrix is initialized with 1.0)
Please note that the problem also occurs on other systems (with the same overall non-deterministic behaviour), for example:
- card: Tesla C1060
- driver: 180.22
- SDK: cuda 2.1
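Since the input matrix is all 1.0, the correct output is known in closed form: an unnormalized forward 2D FFT should give NX*NY in bin (0,0) and zero everywhere else. Below is a minimal host-side sketch of that kind of check. It is not the code from fft.cu; the names `complex_f` and `check_all_ones_spectrum` are made up for illustration, and it assumes a row-major layout with NY as the fastest-varying index.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for cufftComplex: real part in x, imaginary part in y. */
typedef struct { float x, y; } complex_f;

/*
 * Compare the result of an unnormalized forward 2D FFT of an all-ones
 * nx-by-ny matrix (row-major, ny fastest) against the closed-form answer:
 * nx*ny in bin (0,0), zero in every other bin.
 * Prints up to max_report bins whose error exceeds tol.
 * Returns the largest absolute error, or INFINITY if any value is NaN.
 */
float check_all_ones_spectrum(const complex_f *out, int nx, int ny,
                              float tol, int max_report)
{
    assert(out != NULL && nx > 0 && ny > 0);
    float max_diff = 0.0f;
    int reported = 0;
    int saw_nan = 0;

    for (int i = 0; i < nx; ++i) {
        for (int j = 0; j < ny; ++j) {
            const complex_f v = out[(size_t)i * (size_t)ny + (size_t)j];
            const float expected_re =
                (i == 0 && j == 0) ? (float)nx * (float)ny : 0.0f;

            /* NaN compares false with everything, so test for it explicitly. */
            if (isnan(v.x) || isnan(v.y)) {
                saw_nan = 1;
                continue;
            }

            const float d_re = fabsf(v.x - expected_re);
            const float d_im = fabsf(v.y);
            const float d = d_re > d_im ? d_re : d_im;
            if (d > max_diff) {
                max_diff = d;
            }
            if (d > tol && reported < max_report) {
                printf("bin (%d,%d): expected (%f, 0), got (%f, %f)\n",
                       i, j, expected_re, v.x, v.y);
                ++reported;
            }
        }
    }
    return saw_nan ? INFINITY : max_diff;
}
```

For NX=1024 NY=4096 the expected value in bin (0,0) is 1024 * 4096 = 4194304, which is the value 4194304.000000 that shows up in the failing log above. Returning INFINITY on NaN is only a design choice of this sketch, so that a NaN cannot be silently swallowed by the max comparison.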
This reminds me that the problem may be somewhere between keyboard and chair…
I’m running on Linux. I attach a bash script for testing.
Please rename test.sh.txt to test.sh (this is needed because of the “Upload failed. You are not permitted to upload this type of file” message on the forum). In the same way, rename Makefile.txt, fft.cu.txt, etc.
The problem appears on big problem sizes (1024+).
It looks as if some blocks were not computed (4 of them, I think).
Any ideas? :rolleyes:
Makefile.txt (313 Bytes)
fft.cu.txt (3.52 KB)
test.sh.txt (598 Bytes)
vigg_2.1_185.12_3.txt (12.4 KB)
vigg_2.1_185.12_2.txt (12.5 KB)