First of all, after months of work I am finally able to run my iterations on the GPU.
No doubt, the result is very impressive.
But when I try to increase the number of iterations, the kernel fails with the error “unspecified launch failure”.
Surprisingly, it sometimes launches the kernel successfully for the same number of iterations.
I must say that my kernel is quite large, but it does not violate any CUDA resource limits such as the register count.
I searched the forums here and found no definite answer.
It is also not the XP watchdog problem, as the kernel fails within just a few milliseconds.
Please let me know if there is any way to find out the exact reason why CUDA is behaving in such an unprofessional manner.
Yes, another possibility might be bad hardware. But I would verify the code first. Try something like valgrind or GPU Ocelot if you can. Ocelot, in particular, is fantastic for isolating improper memory use.
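As part of verifying the code first, it usually helps to check the error status after every launch, so a failure is reported at the iteration that caused it rather than at some later call. The following is only a minimal sketch, not your actual code: the kernel `iterate`, the helper `blocks_for`, the macro `CUDA_CHECK`, and the sizes are all made up for illustration, and it assumes a toolkit that provides `cudaDeviceSynchronize` (older toolkits used `cudaThreadSynchronize` instead).

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: print file/line and the CUDA error string, then abort.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Number of blocks needed to cover n elements (rounds up).
static int blocks_for(int n, int threads)
{
    return (n + threads - 1) / threads;
}

// Hypothetical stand-in for the real kernel: one guarded write per thread.
__global__ void iterate(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {   // without this guard the last block writes out of bounds
        data[i] += 1.0f;
    }
}

int main()
{
    const int n = 1000;        // deliberately not a multiple of the block size
    const int threads = 256;
    const int blocks = blocks_for(n, threads);

    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, n * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

    for (int iter = 0; iter < 100; ++iter) {
        iterate<<<blocks, threads>>>(d_data, n);
        CUDA_CHECK(cudaGetLastError());        // errors from the launch itself
        CUDA_CHECK(cudaDeviceSynchronize());   // errors raised while the kernel ran
    }

    CUDA_CHECK(cudaFree(d_data));
    std::puts("all iterations completed");
    return 0;
}
```

Synchronizing after every launch is slow, so this is a debugging aid only. If the failing iteration changes from run to run even with these checks in place, that points more toward an out-of-bounds access or a race (or the hardware) than toward a launch configuration problem.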
Having said that, hardware can cause what you are seeing. I had one particular 9500GT DDR3 card that worked perfectly until you pushed it past about 75% of peak memory bandwidth, at which point it started behaving very erratically: random launch failures, driver errors, video RAM corruption. Even in standard OpenGL benchmarks it would run happily for hours, but my CUDA code could make it start failing in minutes. Emulation with valgrind, Ocelot, and cuda-gdb never helped find a bug in the code, and I was able to run it happily on other hardware. At the suggestion of someone here, I tried underclocking, and it helped a bit, but in the end I put it down to bad hardware and gave up on it.