Hi,
I’m doing some tests with the CUDA sample from the article "Separate Compilation and Linking of CUDA C++ Device Code" on the NVIDIA Technical Blog.
If I run the code exactly as it appears in the article, it works fine. But if I change the number of iterations in the main function, it crashes. The change is the following:
int main(int argc, char ** argv)
{
    ...
    for(int i=0; i<500000; i++) // I changed the number of iterations from 100 to 500000
    {
        float dt = (float)rand()/(float) RAND_MAX; // Random distance each step
        advanceParticles<<< 1 + n/256, 256>>>(dt, devPArray, n);
        cudaDeviceSynchronize();
    }
    ...
The only change I made is the number of iterations, from 100 to 500000. With this change the device crashes and I have to reset the workstation.
So I have a question:
- Is there a limit on the number of kernel launches?
If there is no limit, why does the program crash?
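To narrow this down, the loop could check the CUDA error codes after every launch, so that the first failing iteration is reported instead of being silently ignored. Below is a minimal sketch of that idea, not the article's code: `advanceValues`, `runSteps`, and the element count are hypothetical stand-ins for `advanceParticles` and the particle array, and it only assumes the standard runtime calls `cudaGetLastError`, `cudaDeviceSynchronize`, and `cudaGetErrorString`.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for advanceParticles: shifts every element by dt.
__global__ void advanceValues(float dt, float *x, int n)
{
    int idx = threadIdx.x + blockIdx.x * blockDim.x;
    if (idx < n) { x[idx] += dt; }
}

// Launches the kernel `steps` times and returns how many steps
// completed without any CUDA error (0 if the setup itself fails).
int runSteps(int steps)
{
    const int n = 1000000;  // assumed element count, for illustration only
    float *devX = nullptr;
    if (cudaMalloc((void **)&devX, n * sizeof(float)) != cudaSuccess) { return 0; }
    if (cudaMemset(devX, 0, n * sizeof(float)) != cudaSuccess)
    {
        cudaFree(devX);
        return 0;
    }

    int done = 0;
    for (int i = 0; i < steps; i++)
    {
        float dt = (float)rand() / (float)RAND_MAX;  // random distance each step
        advanceValues<<<1 + n / 256, 256>>>(dt, devX, n);

        cudaError_t err = cudaGetLastError();  // errors from the launch itself
        if (err == cudaSuccess) { err = cudaDeviceSynchronize(); }  // errors during execution
        if (err != cudaSuccess)
        {
            fprintf(stderr, "step %d failed: %s\n", i, cudaGetErrorString(err));
            break;
        }
        done++;
    }

    cudaFree(devX);
    return done;
}

int main()
{
    const int steps = 1000;
    int done = runSteps(steps);
    printf("%d of %d steps completed\n", done, steps);
    return done == steps ? 0 : 1;
}
```

With a check like this in the original loop, the iteration index and error string of the first failure would at least show whether the problem comes from the launch or from the kernel execution.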
Thank you.