Program works on small instances of the problem; first-chance exception thrown on larger ones.

Hey, I have written the following code and it works fine on inputs of size 10, 20, 30 and 40, but for size 50 and above it throws a first-chance exception. Could anyone run this program on their PC and tell me whether they get the output or whether it crashes as well?

Here is the program (press Enter 3 times when running it, because I have some breaks in there):

#include <iostream>

#include <fstream>

using namespace std;

file_50.txt (9.48 KB)

You are probably hitting the watchdog timer timeout. If not, run your kernel under cuda-memcheck to check for stray memory accesses. See my signature for links.

The thing is that it breaks when copying data from the CPU to the GPU, at least that's where Visual Studio indicates. So I think it does not even get to the kernel.

@tera, I have tried following your advice but I found nothing. I am kind of stuck, big time.

You have an out-of-bounds access in host code. You allocate M as an array of bool

bool* M = new bool[N];

and then try to copy from it as if it were an array of int

cudaSafe(cudaMemcpy(dev_M, M , N * sizeof(int), cudaMemcpyHostToDevice), "memcpy 4");

even though sizeof(int) = 4*sizeof(bool).

I have changed that and it still produces the error.
HOWEVER, I have done extensive research and I THINK I know what is wrong.

In my first kernel I have a for-loop, and then in the CPU code I call cudaThreadSynchronize() because I need all threads to finish before going on to the next kernel. Is that the right way to synchronize it? Also, in this kernel I have a conditional statement that will be evaluated differently for EVERY thread. Can this have any impact?

Solved. Thanks for the input.