I’ve written CUDA-enabled software that performs some matrix operations in parallel, and it works fine. The problem is that, starting from a certain problem size, I always get the same error: code 77, which I believe is cudaErrorIllegalAddress. The code is quite complicated, so there’s little point in pasting all of it here; just tell me what you want to know and I’ll explain or paste the relevant part. The strange thing is that the algorithm is very generic and performs the very same operations, just on larger datasets. The data isn’t that big, either: it generates only about 4% GPU load (CPU load is quite high, at 99%), and memory occupancy is only 12 MB.
I know that at this stage it’s like guessing, but what would you suggest? What should I check? I tried to analyze the CUDA performance reports in Visual Studio, but I haven’t noticed anything unusual. I’m attaching two reports: one for the largest working problem size and one for the first size at which the algorithm starts failing.
lastWorking.rar (1.6 MB)
firstFailing.rar (988 KB)