Release mode: no values

I am having an issue: when I run the program in debug, it acts normally; I get all my return values and everything is happy. When I run in release, I get no values any more; they are all either 0 or not returned, I can't really tell because I can't set a breakpoint. I already have fmad=false, and I changed all my data types to not include floats. I am running CUDA Toolkit 11.0 and Visual Studio 2019 C++.

This is how I am calling the method:

 double* ForceGPU = new double[4];
 ForceGPU = ObliqueCutMaterialModel_ComputeForce_GPU(feed, speed, rake, backrake, hardnessRatio,
     m_SpeedLevelsLocal, numfeed, m_FeedLevelsLocal, numspeed, m_RakeLevelsLocal,
     m_ChipTableLocal, numrake, m_BackRakeLevelsLocal, numbackrake);

Again, all great in debug, nothing in release. Please help!

There is a very high likelihood that there is a bug in your code. What happens if you run your program under the control of cuda-memcheck? If it reports any issues, you would want to fix those.

At the moment, I'm not even launching a kernel. I'm simply running the C++ code in the .cu file just to ensure it's running. And as I mentioned, when it runs in debug it all works fine, no errors; switching to release, I get nothing back, or zeros. I wanted to break it into small chunks, so I have the code running in the .cu file, but it doesn't launch a kernel and pass data to the GPU. So in this case there shouldn't really be any memory issues, as far as I can see?

So you have a bunch of C++ code compiled for the host and nothing running on the GPU yet. “Works in debug but fails in release” unfortunately does not tell us anything other than that, with high likelihood, there is a bug in the code.

Use your preferred standard debugging methods to track down the root cause. I myself like to debug by instrumenting code so it creates a log when I run the code.

It actually turned out to be a compiler optimization issue. When I set the compiler optimization for the CUDA/C++ host code to disabled, I get the same results from both debug and release. It was not a code bug. But I do like the idea of writing a log in release; that can help moving forward.

There is something else that I am seeing that maybe you can lend some insight into. We have the previous iteration of our code, from before we included the CUDA toolkit and the build customization. Without changing any code, just by introducing the nvcc compiler into the mix, our calculations change. I'm not sure how this would be possible unless there is some hybrid compiling happening, where maybe .cpp files are being compiled with nvcc? I have no idea how this could happen. Do you have any ideas?

Compiler bugs do occur, but the CUDA compiler is quite mature, so a programmer is unlikely to encounter one. If this were my code and I found severe functional differences between debug and release builds, I would vigorously investigate until I had established the precise root cause. The results of such an exercise are usually very instructive, e.g. inadvertent reliance on undefined C++ behavior, subtle race conditions, uninitialized data, etc.

I have encountered quite a bit of code over the years (including my own!) that was broken but just “happened to work” in a particular environment. A change in the environment, such as using a different compiler or changing compiler switches then exposed existing breakage.

I cannot speculate on issues with unknown code. Generally speaking, judging by the intermediate files it produces, the CUDA tool chain appears to completely parse and re-write the host portion of the code in a .cu file before sending it to the host compiler. The occasional functional problem I have seen with that is when machine-specific intrinsics are used (which are, by definition, not part of C++). In those cases it helps to isolate that portion of the host code in separate .cpp files so it is compiled by the host compiler verbatim.
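As an illustration of that separation (the file names and the function are hypothetical, shown here as one listing): keep the intrinsic-using host code behind a plain C++ interface in its own translation unit, so it never passes through nvcc's rewriting step.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics (x86/x64 only); not valid C++ per se

// simd_sum.h -- plain C++ declaration; safe to include from .cu files
double sum4(const double* v);

// simd_sum.cpp -- compiled verbatim by the host compiler (cl.exe here),
// so nvcc never parses or re-writes the intrinsics below
double sum4(const double* v) {
    __m128d lo = _mm_loadu_pd(v);      // loads v[0], v[1]
    __m128d hi = _mm_loadu_pd(v + 2);  // loads v[2], v[3]
    __m128d s  = _mm_add_pd(lo, hi);   // {v[0]+v[2], v[1]+v[3]}
    double out[2];
    _mm_storeu_pd(out, s);
    return out[0] + out[1];
}
```

The .cu files only ever see the declaration in the header, which is ordinary C++.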