[SOLVED] Sample CUDA program crashes

Hi,

I’m running some tests with the CUDA sample from this article: https://devblogs.nvidia.com/parallelforall/separate-compilation-linking-cuda-device-code/

If I run the code exactly as it appears in the article, it works fine. But if I change the number of iterations in the main function, it crashes. The change is the following:

int main(int argc, char ** argv)
{
        ...
        for(int i=0; i<500000; i++) // iteration count changed from 100 to 500000
        {
                float dt = (float)rand()/(float) RAND_MAX; // Random distance each step
                advanceParticles<<< 1 +  n/256, 256>>>(dt, devPArray, n);
                cudaDeviceSynchronize();
        }
        ...

The only change I made is the number of iterations, from 100 to 500000. As a result, the device crashes and I have to reset the workstation.

So I have a question:

  • Is there a kernel launch limit?

If there is no limit, why does the program crash?

Thank you.

In the cross-posting:

http://stackoverflow.com/questions/41186537/sample-cuda-program-crash

OP has now reported this in the comments:

"Hi, I tried with texmode and the execution crash. I get the following message: [ 820.536748] NVRM: Xid (PCI:0000:01:00): 79, GPU has fallen off the bus. "

So it would appear to be a system or hardware problem.
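
As a side note, the loop in the question never checks the status returned by the kernel launch or by cudaDeviceSynchronize(), so a failure part-way through goes unnoticed by the program itself. Below is a minimal sketch of the same loop with error checking added. The kernel advanceValues and the helpers checkCuda and runSteps are hypothetical stand-ins, not code from the article; the real advanceParticles kernel and its particle type are not reproduced here. With a hardware fault such as Xid 79 the runtime may report an error on a later call, but that is not guaranteed, so treat this as a diagnostic aid rather than a fix.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for advanceParticles: shifts each value by dt.
__global__ void advanceValues(float dt, float *x, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] += dt;  // bounds check: the grid may have spare threads
}

// Returns true on success; otherwise prints the CUDA error string and returns false.
static bool checkCuda(cudaError_t err, const char *what)
{
    if (err == cudaSuccess) return true;
    fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
    return false;
}

// Runs up to `steps` launches and stops at the first reported error.
// Returns the number of steps that completed without error.
static int runSteps(float *devX, int n, int steps)
{
    for (int i = 0; i < steps; i++) {
        float dt = (float)rand() / (float)RAND_MAX;  // random distance each step
        advanceValues<<<1 + n / 256, 256>>>(dt, devX, n);
        // Launch-time errors (for example a bad configuration) show up here.
        if (!checkCuda(cudaGetLastError(), "kernel launch")) return i;
        // Errors raised while the kernel runs show up at the next sync.
        if (!checkCuda(cudaDeviceSynchronize(), "cudaDeviceSynchronize")) return i;
    }
    return steps;
}

int main()
{
    const int n = 1 << 20;       // assumed problem size, for illustration only
    const int steps = 500000;    // iteration count from the question
    float *devX = nullptr;
    if (!checkCuda(cudaMalloc(&devX, n * sizeof(float)), "cudaMalloc")) return 1;
    if (!checkCuda(cudaMemset(devX, 0, n * sizeof(float)), "cudaMemset")) return 1;

    int done = runSteps(devX, n, steps);
    printf("completed %d of %d steps\n", done, steps);

    cudaFree(devX);
    return done == steps ? 0 : 1;
}
```

If the loop stops early, the printed step count and error string give a rough idea of when the failure happened, which can then be compared against the NVRM lines in the kernel log.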

Hi,

I finally found the problem: it was a temperature issue.