Usually it’s “Please find the problem”. In my case I know the problem, but I don’t know how to build an error check that shows me there is a problem. Consider the following code:
It is obvious that the kernel tries to access an element that was never allocated. So I would expect a result along the lines of “bad memory access”. But on my system (CUDA 11.4, RTX 2070 with driver 471.11, VS2019) I get cudaSuccess four times.
Did I miss something obvious?
Runtime detection of out-of-bounds accesses is not precise to the boundary of any particular allocation. If you access far enough out of bounds, you will trigger a runtime error (for example, change your 1023 to a very large number), but there are no guarantees that any out-of-bounds access will always trigger a runtime error. As far as I know the same principle is true for host code.
Just as a simple example, when I change the 1023 to 10230000, a runtime out-of-bounds error is detected.
To reliably catch such errors, run your code under compute-sanitizer.
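As a rough illustration of the point above, here is a minimal sketch (not the original poster's code). The kernel `writeAt`, the helper `probe`, and the allocation size of 1000 elements are all assumptions made up for this example. It checks both the launch and the synchronization result, which is about as much as the runtime alone can tell you:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: a single thread writes to data[idx].
__global__ void writeAt(int *data, size_t idx)
{
    data[idx] = 42;
}

// Allocates N ints, writes to element idx on the device, and returns the
// first error reported by the allocation, the launch, or the synchronization.
cudaError_t probe(size_t idx)
{
    const size_t N = 1000;  // assumed allocation size, in elements
    int *d = nullptr;
    cudaError_t err = cudaMalloc(&d, N * sizeof(int));
    if (err != cudaSuccess) return err;

    writeAt<<<1, 1>>>(d, idx);
    err = cudaGetLastError();           // catches launch errors
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();  // catches errors raised while the kernel ran

    cudaFree(d);  // return value ignored; it may fail after a kernel fault
    return err;
}

int main(int argc, char **argv)
{
    // Index to write; defaults to one element past the end of the allocation.
    size_t idx = (argc > 1) ? strtoull(argv[1], nullptr, 10) : 1000;
    cudaError_t err = probe(idx);
    printf("index %zu: %s\n", idx, cudaGetErrorString(err));
    return err == cudaSuccess ? 0 : 1;
}
```

If this is built as, say, `probe`, then running `./probe 1000` may well print a success message even though the write is out of bounds, while a much larger index is more likely to produce a runtime error. Running the same binary as `compute-sanitizer ./probe 1000` uses the default memcheck tool, which should flag the invalid global write regardless of how far out of bounds it is.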
“there are no guarantees that any out-of-bounds access will always trigger a runtime error. As far as I know the same principle is true for host code.”
Not every out-of-bounds access represents an access violation with regard to page-based memory protection schemes: an access that runs past the end of an allocation but still lands in a mapped page does not fault. Tools like Valgrind can detect pretty much every out-of-bounds access (at considerable cost in runtime).