I wrote a neural network implementation in CUDA.
In debug mode, with the nvcc compiler option -G (generate debug info), everything works fine and the neural network trains to good results.
But the debug build takes a long time to run, of course.
When I switch this option (generate debug information) off, training gives no results. The network does not learn, which means the calculations produce different results.
The floating-point options are the same for the debug and release builds in the project, and I have also set these options explicitly for both build types:
--ftz=false --prec-div=true --prec-sqrt=true
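For concreteness, the two builds correspond roughly to the command lines sketched below. The file and output names (nn.cu, nn_debug, nn_release) are placeholders rather than the real project's, and any other options the project passes are left out.

```shell
# Debug build: -G generates device debug info and also turns off
# optimizations for device code.
nvcc -G --ftz=false --prec-div=true --prec-sqrt=true -o nn_debug nn.cu

# Release build: same floating-point flags, but without -G, so device
# code is compiled with optimizations enabled.
nvcc --ftz=false --prec-div=true --prec-sqrt=true -o nn_release nn.cu
```

So apart from debug information, the only difference between the two builds should be that device code is optimized in the second one.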
What could be the problem here?