Strange behaviour of CUDA release build

I wrote a neural network implementation in CUDA.
In debug mode, with the nvcc compiler option -G (generate debug info), everything works fine and the neural network trains to good results.
But the debug build takes a long time, of course.
When I switch this option (generate debug information) off, the training gives no results: the network does not learn, meaning the calculations produce different results.
The floating-point options for the debug and release builds are the same in the project, and I also set these options for both build types:
--ftz=false --prec-div=true --prec-sqrt=true

What can be a problem here?

Hypothesis (1): There is at least one bug in your code.
Hypothesis (2): There is a race condition in your code.
Hypothesis (3): Your code is invoking undefined behavior.
Hypothesis (4): There are missing error status checks in your code.
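Regarding hypothesis (4), a common pattern is to wrap every CUDA runtime call in a checking macro and to check kernel launches explicitly. This is a minimal sketch of that pattern (the macro name `CUDA_CHECK` is my own choice, not something from this thread):

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper macro: wraps a runtime API call and aborts on error.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "CUDA error %s at %s:%d\n",               \
                    cudaGetErrorString(err_), __FILE__, __LINE__);    \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

// Usage sketch: API calls are checked directly; kernel launches are
// checked via cudaGetLastError() (launch configuration errors) plus
// cudaDeviceSynchronize() (asynchronous execution errors).
//
//   CUDA_CHECK(cudaMalloc(&d_buf, bytes));
//   myKernel<<<grid, block>>>(d_buf);   // myKernel is hypothetical
//   CUDA_CHECK(cudaGetLastError());
//   CUDA_CHECK(cudaDeviceSynchronize());
```

Without the synchronize-and-check step, a kernel that fails asynchronously can silently leave stale data behind, which looks exactly like "calculations give different results".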

Thanks, njuffa
I narrowed down the problematic part and hope I'll get some results.
There are some strange things (to me) happening in the simplest part of the code.

But anyway, it's curious that I don't see a race condition in the debug run: I always get the same numbers in the calculations.
If it were a data race, I should get different numbers each time.

That’s a misconception. If there is a race condition, you may or may not get different numbers each time, depending on temporal context. In any event, the presence of a race condition is merely a hypothesis.
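To illustrate how a race can still produce repeatable numbers, here is a classic bug in a minimal sketch of my own (not code from this thread): a shared-memory reduction missing a `__syncthreads()` inside the loop. Whether it corrupts the result depends on warp scheduling, and a -G debug build's different instruction scheduling can happen to hide it:

```cuda
// Block-level sum reduction with a deliberate synchronization bug.
__global__ void sumReduce(const float *in, float *out, int n)
{
    __shared__ float buf[256];
    int tid = threadIdx.x;
    buf[tid] = (tid < n) ? in[tid] : 0.0f;
    __syncthreads();

    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride)
            buf[tid] += buf[tid + stride];
        // BUG: __syncthreads() is missing here. Threads in one warp may
        // read buf[tid + stride] before a thread in another warp has
        // written it. Depending on scheduling, the wrong answer can be
        // bit-identical on every run, so repeatability proves nothing.
    }
    if (tid == 0)
        *out = buf[0];
}
```

Races like this can be detected independently of build type with `compute-sanitizer --tool racecheck ./app` (or `cuda-memcheck --tool racecheck` on older toolkits).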


Big thanks, njuffa.
You were right, there was a data race condition. It did not show up in the debug build, where the results matched the CPU code.
Now I wonder why. Maybe the race did not manifest because the debug code runs so slowly? I am not sure, but I have no other idea.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.