On my machine, the CUDA driver version is 4.2 and the CUDA capability (major/minor version number) is 2.1.
I used cuda-memcheck to track down an error in a kernel function, and it reports:
CUDA error at gbh.cu:266: unspecified launch failure
========= Invalid global read of size 8
========= at 0x00003068 in edges.h:110:gb
========= by thread (0,0,0) in block (0,0,0)
========= Address 0x00d745f0 is out of bounds
========= Invalid global read of size 8
========= at 0x00003068 in edges.h:110:gb
========= by thread (0,0,0) in block (1,0,0)
========= Address 0x00d74690 is out of bounds
========= Invalid global read of size 8
========= at 0x00003068 in edges.h:110:gb
========= by thread (0,0,0) in block (2,0,0)
========= Address 0x00d74730 is out of bounds
========= Invalid global read of size 8
========= at 0x00003068 in edges.h:110:gb
========= by thread (0,0,0) in block (3,0,0)
========= Address 0x00d747d0 is out of bounds
========= ERROR SUMMARY: 4 errors
But I can print the same addresses with printf before launching the kernel, like this:
-----------------The first smooth_r address is 0xd745f0
-----------------The second smooth_r address is 0xd74690
-----------------The third smooth_r address is 0xd74730
-----------------The fourth smooth_r address is 0xd747d0
This seems to contradict the cuda-memcheck errors.
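To make the check concrete, here is a minimal sketch of the pattern described above. It is not the actual code from gbh.cu or edges.h: the kernel `read_first`, the buffer sizes, and the use of `smooth_r` as a plain `double *` are all assumptions for illustration. One thing it shows is that printing a pointer with `%p` on the host only prints the pointer's value and never reads the memory behind it, so a successful print does not by itself show that the address is valid for a read inside a kernel.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: performs one 8-byte global read through p,
// the same kind of access cuda-memcheck reports on.
__global__ void read_first(const double *p, double *out)
{
    *out = p[0];
}

int main()
{
    double *smooth_r = NULL;  // name borrowed from the printed output; type is assumed
    double *out = NULL;

    // Both buffers are allocated with cudaMalloc, so both are device pointers.
    cudaMalloc((void **)&smooth_r, 20 * sizeof(double));
    cudaMalloc((void **)&out, sizeof(double));
    cudaMemset(smooth_r, 0, 20 * sizeof(double));

    // Host-side print: shows the pointer value only, nothing is dereferenced.
    printf("smooth_r address is %p\n", (void *)smooth_r);

    // The actual read happens here, on the device.
    read_first<<<1, 1>>>(smooth_r, out);
    cudaError_t err = cudaDeviceSynchronize();
    printf("kernel status: %s\n", cudaGetErrorString(err));

    cudaFree(out);
    cudaFree(smooth_r);
    return 0;
}
```

Running a small program like this under cuda-memcheck, and then swapping in however `smooth_r` is really allocated, might help narrow down whether the pointer passed to the kernel actually refers to device memory.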
What’s the reason?
Thanks for your time.