I’ve been working on my project for the past couple of months and recently I found a bug.
- The kernels work fine (i.e. they produce the expected result on an array of data).
- I’ve been running my code on a data set, but from run to run, on different machines with the same compute capability (CC 2.0 in my case), the results match only partly: some are identical, but not all of them.
Looking closer in cuda-gdb and analyzing the output data, it seems that one of the kernels starts producing unexpected results after the first half of my matrix has been calculated.
Any ideas on what could cause this and how to resolve it?