I have Fortran code that I annotated with OpenACC directives.
Launching the application results in:
call to cuMemFree returned error 700: Launch failed
cuda-memcheck shows a huge number of errors like:
========= Invalid __global__ read of size 4
========= at 0x0003cb58 in mp_thompson_837_gpu
========= by thread (56,0,0) in block (0,29,0)
========= Address 0x0c42c9fc is out of bounds
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame:/usr/lib/libcuda.so (cuLaunchKernel + 0x34b) [0x54b6b]
...
I was unable to debug my application with cuda-gdb.
Is there a proper way to debug this kind of application (an OpenACC kernel)?
Note that the boundary check test passed OK.