Need a program that guarantees a fatal crash to reset my card

Does anyone have a program that guarantees a fatal CUDA crash? (permanent freeze, blue screen of death, etc.; whatever the 1.0 driver can’t recover from)
I just found that the random crashes I mentioned in another topic don’t happen between a fatal crash and the next non-fatal crash.
Source code and winXP executable are both welcome.
Please send to hqm03ster@gmail.com or reply to this topic.

Just try to write a cuda version of a forkbomb, and I’m sure you’ll manage to freeze the blasted card. Since the code on the card is asynchronous to the CPU runtime, I don’t know if the computer will freeze along with the GPU. Make sure to have as many bank conflicts as possible to worsen your forkbomb!

Well, I tried this :)
It indeed managed to freeze CUDA. But it didn’t reset the card :(
I guess one has to confuse the driver more to reset it…

Might as well just turn the power off as that is what you will have to do to recover from the freeze…

I already tried powering off (for intervals ranging from several seconds to a day) before posting this…

Writing to random memory locations is always fun, and fast too. Just adapt the parallel Mersenne Twister for maximum crashing performance!

If that is the case I would suggest you must be referencing uninitialised memory somewhere - go through your code with a fine-toothed comb.

I did that, several times, before I started posting stuff here. Trust me, I’m not the kind of person who asks such questions before making sure it’s not an initialization problem.
I have a memset after every alloc on both CPU and GPU, and the random behavior doesn’t change. The only thing I haven’t initialized is a PBO, which is NEVER read. And .bss should always be initialized in winXP.
I also tried running the program step-by-step to nail down the crash. But adding printf, gets or enough test code around a kernel launch would stop it from crashing. >_<
Basically, things are less likely to crash if extra code slows them down enough. That makes it sound temperature-related. But it once crashed when nvcpl.dll reported 58…
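
For readers following along, here is a minimal sketch of the two habits described in this post: zeroing every device allocation, and checking a kernel launch without relying on printf timing. It is not the poster's code. The names `device_alloc_zeroed`, `fill` and `launch_checked` are hypothetical, and `cudaDeviceSynchronize` is the current runtime call (the 1.0-era toolkit discussed here used `cudaThreadSynchronize` instead). Because kernel launches are asynchronous, synchronizing right after each launch is one way to tie a failure to the kernel that caused it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: allocate n bytes on the device and zero them.
// Returns NULL on failure after printing the runtime's error string.
void *device_alloc_zeroed(size_t n)
{
    void *p = NULL;
    cudaError_t err = cudaMalloc((void **)&p, n);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc: %s\n", cudaGetErrorString(err));
        return NULL;
    }
    err = cudaMemset(p, 0, n);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemset: %s\n", cudaGetErrorString(err));
        cudaFree(p);
        return NULL;
    }
    return p;
}

// Hypothetical stand-in kernel: writes `value` into the first n elements.
__global__ void fill(int *out, int n, int value)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = value;
}

// Launches the kernel, then checks both the launch itself and its
// asynchronous completion. Assumes n > 0.
cudaError_t launch_checked(int *d_out, int n, int value)
{
    const int threads = 128;
    const int blocks = (n + threads - 1) / threads;  // round up
    fill<<<blocks, threads>>>(d_out, n, value);
    cudaError_t err = cudaGetLastError();   // e.g. invalid launch configuration
    if (err != cudaSuccess)
        return err;
    return cudaDeviceSynchronize();         // errors raised while the kernel ran
}
```

Calling `launch_checked` after every allocation made with `device_alloc_zeroed` costs some throughput, so it is probably best treated as a debugging aid rather than something to leave in a tuned build.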

If you run in emulation mode do you still get errors of some kind? And by uninitialized memory, it’s possible that osiris meant unallocated memory, i.e. you’re reading or writing past the end of the space you allocated. I’ve found that running in emulation mode with valgrind can help track down such cases.
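
As a rough illustration of the "past the end" case, the sketch below pads a buffer with canary values and counts how many were modified after a kernel runs. This is an assumed technique for illustration, not something taken from the poster's program. `scale_by_two` and `count_overrun` are hypothetical names, and `cudaDeviceSynchronize` is the current runtime call rather than the 1.0-era one. The grid is rounded up to whole blocks, so the `i < n` guard is the only thing keeping the surplus threads out of the canary region.

```cuda
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: doubles each of the first n elements.
__global__ void scale_by_two(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)              // without this guard, surplus threads write past n
        data[i] *= 2.0f;
}

// Runs the kernel over n floats followed by `pad` canary floats.
// Returns the number of canaries that changed (0 means no overrun was
// detected), or -1 on an allocation or CUDA error. Assumes n > 0.
int count_overrun(int n, int pad)
{
    const int total = n + pad;
    const size_t bytes = (size_t)total * sizeof(float);
    float *h = (float *)malloc(bytes);
    if (h == NULL)
        return -1;
    for (int i = 0; i < total; ++i)
        h[i] = (i < n) ? 1.0f : -1.0f;      // -1.0f marks the canary region

    float *d = NULL;
    int result = -1;
    if (cudaMalloc((void **)&d, bytes) == cudaSuccess &&
        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice) == cudaSuccess) {
        const int threads = 256;
        const int blocks = (n + threads - 1) / threads;  // rounds up
        scale_by_two<<<blocks, threads>>>(d, n);
        if (cudaDeviceSynchronize() == cudaSuccess &&
            cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost) == cudaSuccess) {
            result = 0;
            for (int i = n; i < total; ++i)
                if (h[i] != -1.0f)
                    ++result;               // a canary was overwritten
        }
    }
    cudaFree(d);    // freeing a null pointer is a no-op
    free(h);
    return result;
}
```

A canary check like this only catches writes that land in the padding right after the buffer. Reads past the end, or writes that land elsewhere, would go unnoticed, which is where a memory checker in emulation mode is likely to be more thorough.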

That’s a useful suggestion, but it’s not likely since I allocated everything quite conservatively… I’ll try it anyway.

My last emulation run was correct. Another one would take hours to complete, though:(

The new emulation run indeed succeeded.