I’m currently building an n-body simulation with CUDA and have run into a strange problem. One step of the simulation copies data off the GPU with cudaMemcpy(); the data is stored as three 1D arrays. Everything works correctly when I debug with Nsight, but when I build and run the project with Visual Studio’s built-in tools, every cudaMemcpy() call returns error 700 (illegal memory access). Any ideas why this happens?
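For context on the error itself: code 700 is cudaErrorIllegalAddress, and cudaMemcpy() can also report errors left over from earlier asynchronous kernel launches, so the copy may not be the call that is actually at fault. Below is a minimal sketch of how the copy step could be instrumented to find out. It is not the actual simulation code: CUDA_CHECK, stepKernel, the array names, and the sizes are all hypothetical placeholders.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper (not part of CUDA): abort with file/line on any runtime error.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "%s:%d: %s (%d)\n", __FILE__, __LINE__,  \
                         cudaGetErrorString(err_), (int)err_);            \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// Placeholder kernel standing in for one simulation step (assumed, not the real one).
__global__ void stepKernel(float* x, float* y, float* z, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // bounds guard: threads past the end of the arrays do nothing
        x[i] += 1.0f;
        y[i] += 2.0f;
        z[i] += 3.0f;
    }
}

int main()
{
    const int n = 1000;                      // assumed body count
    const size_t bytes = n * sizeof(float);

    // Three separate 1D device arrays, zero-initialised.
    float *dX = nullptr, *dY = nullptr, *dZ = nullptr;
    CUDA_CHECK(cudaMalloc(&dX, bytes));
    CUDA_CHECK(cudaMalloc(&dY, bytes));
    CUDA_CHECK(cudaMalloc(&dZ, bytes));
    CUDA_CHECK(cudaMemset(dX, 0, bytes));
    CUDA_CHECK(cudaMemset(dY, 0, bytes));
    CUDA_CHECK(cudaMemset(dZ, 0, bytes));

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up to cover all n
    stepKernel<<<blocks, threads>>>(dX, dY, dZ, n);
    CUDA_CHECK(cudaGetLastError());       // catches launch-configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());  // catches errors raised while the kernel ran

    // Host buffers must each hold at least `bytes` bytes.
    float* hX = static_cast<float*>(std::malloc(bytes));
    float* hY = static_cast<float*>(std::malloc(bytes));
    float* hZ = static_cast<float*>(std::malloc(bytes));
    if (!hX || !hY || !hZ) return EXIT_FAILURE;

    // If the checks above passed, a failure here points at the copy arguments.
    CUDA_CHECK(cudaMemcpy(hX, dX, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaMemcpy(hY, dY, bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaMemcpy(hZ, dZ, bytes, cudaMemcpyDeviceToHost));

    std::printf("x[0]=%.1f y[0]=%.1f z[0]=%.1f\n", hX[0], hY[0], hZ[0]);

    std::free(hX); std::free(hY); std::free(hZ);
    CUDA_CHECK(cudaFree(dX));
    CUDA_CHECK(cudaFree(dY));
    CUDA_CHECK(cudaFree(dZ));
    return 0;
}
```

With checks like these in place, an error reported at cudaDeviceSynchronize() rather than at cudaMemcpy() would suggest the illegal access happens inside a kernel (for example an out-of-bounds index), and the copy is only where it first becomes visible.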
In case I can’t solve this, is there a way to run the program under Nsight without any breakpoints at all? Right now it automatically sets a breakpoint before every kernel, and I can’t figure out how to turn that off.