Different results from printf and cuda-gdb


I’m probing registers c[0], c[1], c[2], and c[3] in a CUDA program.
Strangely, the values shown in CUDA-GDB are correct, but the values printed by printf are wrong, and the program ultimately writes back the wrong results.
I’ve attached my code here:
cuda.zip (6.0 KB)
You can compile it with:
nvcc -o mma_sp runner_mma_sp.cu -arch sm_80 -Xcompiler -fopenmp
Can anyone tell me what’s going on? Thanks in advance.

cuda-gdb version: 12.6

The problem is solved, see:
