GPU (semi-) diagnostic tool(s) CUDA Memtest - WOOO!

I just wanted to share my excitement after finding such a tool.
Here are some other names people might find it under:
cuda memtest

I searched through the forum and found SPWorley mentioning it a couple of times, but only after I had already found out about it elsewhere.

I had an issue on one of my 295 gpus, and this tool quickly pinpointed the same gpu that had been giving me problems.

We manage a few dozen gpus, so I’ve added this tool to the set of things I run each night on each of them, just as a sanity check.

Figured I’d spread the love in case someone didn’t know about it, and also to see if anyone has any other recommendations for gpu diagnostics beyond the following post:

Ocelot is another amazing tool for testing and emulating CUDA codes on a CPU.

Among its features are detection of invalid memory accesses, race conditions, and deadlocks.