Where can I find information on kernel timeouts (the watchdog) and related issues?

I’ve looked through a lot of the CUDA docs (I think all that might be relevant) and don’t see anything that addresses how to deal with kernel timeouts and related issues. Could someone point me in the right direction?

I’m working with code that does intensive computations on a 3D grid. There’s enough work that a single kernel launch will probably run much longer than the watchdog timer limit (5 sec on Linux, I believe) even on fast hardware, so the work needs to be broken into sub-problems. The code will be run (by clients) on all sorts of GPU hardware, so I need to choose chunk sizes large enough to keep launch overhead low but small enough that no launch gets killed by the watchdog. If a launch is killed anyway, I need to detect this, back up, and split the failed part into smaller chunks.
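To make the back-up-and-split idea concrete, here is a rough host-side sketch of what I have in mind. It is plain C++ with no CUDA calls: `Chunk`, `LaunchFn` and `run_adaptive` are names I made up for illustration, and the `launch` callback is a placeholder for "launch the kernel on this range of grid slabs, wait for it, and report whether it survived". How to detect the kill reliably, and whether the device is still usable afterwards, is exactly the part I can’t find documented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// A contiguous range of grid slabs [begin, end) handled by one launch.
struct Chunk {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical placeholder for a real kernel launch plus synchronization.
// Returns true if the chunk completed, false if it was killed (e.g. timeout).
using LaunchFn = std::function<bool(const Chunk&)>;

// Process slabs [0, total), starting with chunks of `initial` slabs.
// When a chunk fails, halve the chunk size and retry from the same position.
// Returns false if even a single-slab chunk fails.
bool run_adaptive(std::size_t total, std::size_t initial, const LaunchFn& launch) {
    assert(initial > 0);
    std::size_t pos = 0;         // first slab not yet completed
    std::size_t size = initial;  // current chunk size in slabs
    while (pos < total) {
        const std::size_t end = std::min(pos + size, total);
        if (launch(Chunk{pos, end})) {
            pos = end;                 // chunk finished, move on
        } else if (end - pos > 1) {
            size = (end - pos) / 2;    // back up and retry with half the work
        } else {
            return false;              // cannot split any further
        }
    }
    return true;
}
```

This version only ever shrinks the chunk size; a real one would probably also time the successful launches and grow the size again to keep overhead down, and would have to restore whatever device state a killed launch leaves behind.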

It’d also be good to know how to turn the GPU card into a pure compute device, letting the on-board graphics handle the display. But again, I don’t see much in the way of documentation.