Running program standalone works fine but using cuda-memcheck gives errors


I tried testing my program with cuda-memcheck, but it reports many errors and the output is no longer correct.

But if I run the program standalone as usual, everything seems fine and the output is always correct.

Can you please tell me why?


With no other facts or code, we can only guess.

But the best guess is that your code has a bug. It’s especially common for a race condition to surface when you turn on debugging or run under cuda-memcheck, since both usually change instruction and/or block scheduling, and that exposes a bug that was already there.
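
Since we haven't seen your code, here is only a guess at what such a race can look like: a minimal sketch of a single-block shared-memory reduction. All of the names (`block_sum`, `run_block_sum`, `CUDA_CHECK`, `kThreads`) are made up for the illustration and have nothing to do with your program. The version below is the correct one; the comment in the loop marks the barrier whose removal would introduce a race that can go unnoticed in a normal run and then show up once a tool changes the timing.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,        \
                    cudaGetErrorString(err_));                        \
            exit(EXIT_FAILURE);                                       \
        }                                                             \
    } while (0)

constexpr int kThreads = 256;  // one block, power-of-two size (assumption)

// Sums kThreads ints with a shared-memory tree reduction.
__global__ void block_sum(const int *in, int *out) {
    __shared__ int buf[kThreads];
    int tid = threadIdx.x;

    buf[tid] = in[tid];
    __syncthreads();  // every load must land before any thread reads buf

    for (int stride = kThreads / 2; stride > 0; stride /= 2) {
        if (tid < stride) {
            buf[tid] += buf[tid + stride];
        }
        // Removing this barrier creates a race: a thread in one warp could
        // read buf[tid + stride] before another warp has finished updating
        // it in the previous step. Whether the result is still correct then
        // depends on scheduling, which tools and debug builds change.
        __syncthreads();
    }

    if (tid == 0) {
        *out = buf[0];
    }
}

// Host-side driver: fills the input with ones, so the sum equals kThreads.
int run_block_sum() {
    int h_in[kThreads];
    for (int i = 0; i < kThreads; ++i) {
        h_in[i] = 1;
    }

    int *d_in = nullptr;
    int *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_in, sizeof(h_in)));
    CUDA_CHECK(cudaMalloc(&d_out, sizeof(int)));
    CUDA_CHECK(cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice));

    block_sum<<<1, kThreads>>>(d_in, d_out);
    CUDA_CHECK(cudaGetLastError());       // launch errors
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while running

    int h_out = 0;
    CUDA_CHECK(cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
    return h_out;
}

int main() {
    printf("sum = %d\n", run_block_sum());  // prints "sum = 256"
    return 0;
}
```

If you suspect something like this in your own code, building with `nvcc -lineinfo` and running `cuda-memcheck --tool racecheck ./your_app` (the executable name is a placeholder) should report shared-memory hazards with source lines. Checking the return value of every CUDA call, as the macro above does, also helps, because an error that is silently ignored in a normal run can still leave you with output that only looks correct.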