In my CUDA program, there is a kernel function that contains a “for loop” doing float computations.
The loop usually iterates many times, for example 2^20.
Actually, the exponent depends on the input data, but in any case it is a very large number.
The problem is that my program sometimes does not execute the kernel function normally.
(The MPI version of the program always works, so I believe the algorithm itself is correct.)
I have checked, and I now know that some threads do not finish the “for loop” normally.
That is, they quit the loop suddenly, and the point where they stop is different in each execution.
I wonder why this happens…
Has anyone had this kind of experience?
Run your program under cuda-memcheck to find invalid memory accesses. How long does the program run - could you be triggering the watchdog timer?
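To check the watchdog possibility directly, something like the following might help. It is only a sketch: longLoopKernel is a made-up stand-in for your kernel, and watchdogEnabled and runAndCheck are hypothetical helper names. It reads the kernelExecTimeoutEnabled device property and compares the error reported after the launch against cudaErrorLaunchTimeout.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for the long-running kernel in the question.
__global__ void longLoopKernel(float *out, int n, long long iterations)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;
    float acc = 0.0f;
    for (long long i = 0; i < iterations; ++i)
        acc += 1.0e-6f * (float)(i & 7);   // placeholder float work
    out[tid] = acc;
}

// Returns true if the device reports a run-time limit on kernels.
// Returns false if the property query itself fails.
bool watchdogEnabled(int device)
{
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, device) != cudaSuccess)
        return false;
    return prop.kernelExecTimeoutEnabled != 0;
}

// Launches the kernel, waits for it, and reports any error.
cudaError_t runAndCheck(float *d_out, int n, long long iterations)
{
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    longLoopKernel<<<blocks, threads>>>(d_out, n, iterations);

    cudaError_t err = cudaGetLastError();      // launch-configuration errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();         // errors raised while running

    if (err == cudaErrorLaunchTimeout)
        fprintf(stderr, "kernel exceeded the run-time limit (watchdog)\n");
    else if (err != cudaSuccess)
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    return err;
}
```

For example, call watchdogEnabled(0) once at startup and use runAndCheck(d_out, n, iterations) in place of the bare launch. The limit is normally active on a GPU that also drives a display.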
Thank you for your attention.
Following your comment, I ran my program under cuda-memcheck.
However, no error was found, even though the program quits abnormally.
I also suspect some memory problem,
because I found that accessing certain memory (an array in local memory) makes the program quit.
hmm… it is very difficult…
Sounds like a watchdog timer timeout.
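If that is the cause, one common workaround (an assumption on my part, since I haven't seen your code) is to split the loop across several shorter kernel launches, so that no single launch runs long enough to hit the limit. A minimal sketch, where loopChunk, runInChunks and the placeholder loop body are all hypothetical:

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel: runs iterations [start, start + count) and keeps
// the per-thread running value in global memory between launches.
__global__ void loopChunk(float *state, int n, long long start, long long count)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;
    float acc = state[tid];                    // resume from the previous chunk
    for (long long i = start; i < start + count; ++i)
        acc += 1.0f;                           // placeholder for the real loop body
    state[tid] = acc;                          // save for the next chunk
}

// Runs `total` iterations as a series of launches of at most `chunk`
// iterations each. Stops and returns the first error encountered.
cudaError_t runInChunks(float *d_state, int n, long long total, long long chunk)
{
    if (chunk <= 0) return cudaErrorInvalidValue;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    for (long long start = 0; start < total; start += chunk) {
        long long remaining = total - start;
        long long count = (remaining < chunk) ? remaining : chunk;

        loopChunk<<<blocks, threads>>>(d_state, n, start, count);

        cudaError_t err = cudaGetLastError();  // launch-configuration errors
        if (err == cudaSuccess)
            err = cudaDeviceSynchronize();     // errors raised while running
        if (err != cudaSuccess)
            return err;
    }
    return cudaSuccess;
}
```

This only works if each thread's state can be saved to global memory between launches, and chunk has to be tuned so that every launch stays well below the timeout. Other options are running on a GPU with no display attached or changing the driver's timeout setting.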
Dear tera and Keldor314,
Thank you very much.
The problem was caused by the watchdog timer.
Because this is my first CUDA program, especially one for big data, I had not run into this kind of GPU programming problem before.
To be honest, I didn’t know about the “watchdog timer” before your comments.
Now my program is working well.
I really appreciate your attention and comments.