Hi Max, thanks for posting this issue. As the other users have mentioned, this symptom typically occurs when debugging information was not generated at compile time. However, from the link I can see that you have already tried a variety of compiler options and still hit the same problem. You have clearly run several experiments here, but we’re likely going to need some more information.
(1) When you stop at the false breakpoint at the end of the kernel, can you run the ‘info shared’ command and post the output here? This will let us know whether symbols have been properly loaded from the CUDA library.
(2) Can you run nvidia-bug-report.sh and send the results to firstname.lastname@example.org? This will let us know exactly which GPU(s) you have installed in your system and what the setup is.
(3) Are you using the TMPDIR environment variable, or is your temporary directory in /tmp on this machine? Can you run ‘ls -la /tmp’ and post the results?
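If it helps, the information for (3) can be collected in one go with something like the sketch below. It assumes a POSIX shell and that /tmp is the fallback when TMPDIR is unset or empty; the helper name effective_tmpdir is just made up for illustration and is not part of any NVIDIA tool.

```shell
# Hypothetical helper: print the temporary directory currently in effect.
# Assumption: /tmp is used when TMPDIR is unset or empty.
effective_tmpdir() {
    printf '%s\n' "${TMPDIR:-/tmp}"
}

# Show whether TMPDIR is set at all, then list the directory in use.
echo "TMPDIR is: ${TMPDIR:-<unset>}"
ls -la "$(effective_tmpdir)"
```

Run it from the same shell you launch the debugger from, so that it sees the same environment, and paste the output here.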