How do I preempt compute workloads on the GTX 1080?

The GTX 1080 whitepaper describes several preemption modes. How do I issue preemption requests to a GTX 1080 card?

The preemption discussed in the GTX 1080 whitepaper is on the graphics side. There is no formal way in CUDA to issue a “preemption request”.

The closest you will come to that currently is CUDA stream priorities:

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#stream-priorities
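For illustration, here is a minimal sketch of how stream priorities might be used. The kernel `spinKernel`, the helper `runPriorityDemo`, the `CHECK` macro, and the buffer sizes and iteration counts are all made up for this example; only the CUDA runtime calls are real API. As far as I understand the programming guide, priorities are a scheduling hint: waiting blocks from the high-priority stream are preferred as blocks from the low-priority stream finish, and blocks that are already running are not interrupted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical busy-work kernel; the name and body are illustrative only.
__global__ void spinKernel(float *data, int n, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k)
            v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

// Hypothetical helper macro: print the error and bail out of the function.
// For brevity it does not release resources on the error path.
#define CHECK(call)                                                   \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(err_));                        \
            return 1;                                                 \
        }                                                             \
    } while (0)

// Returns 0 on success, 1 on any CUDA error.
int runPriorityDemo()
{
    // Numerically lower values mean higher priority. If both values come
    // back as 0, the device does not support stream priorities.
    int leastPriority = 0, greatestPriority = 0;
    CHECK(cudaDeviceGetStreamPriorityRange(&leastPriority, &greatestPriority));
    printf("priority range: least=%d greatest=%d\n",
           leastPriority, greatestPriority);

    cudaStream_t lowStream, highStream;
    CHECK(cudaStreamCreateWithPriority(&lowStream, cudaStreamNonBlocking,
                                       leastPriority));
    CHECK(cudaStreamCreateWithPriority(&highStream, cudaStreamNonBlocking,
                                       greatestPriority));

    const int n = 1 << 20;            // arbitrary problem size
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    float *lowBuf = nullptr, *highBuf = nullptr;
    CHECK(cudaMalloc(&lowBuf, n * sizeof(float)));
    CHECK(cudaMalloc(&highBuf, n * sizeof(float)));
    CHECK(cudaMemset(lowBuf, 0, n * sizeof(float)));
    CHECK(cudaMemset(highBuf, 0, n * sizeof(float)));

    // Long-running background work goes to the low-priority stream,
    // short latency-sensitive work to the high-priority stream.
    spinKernel<<<blocks, threads, 0, lowStream>>>(lowBuf, n, 10000);
    spinKernel<<<blocks, threads, 0, highStream>>>(highBuf, n, 100);
    CHECK(cudaGetLastError());
    CHECK(cudaDeviceSynchronize());

    CHECK(cudaFree(lowBuf));
    CHECK(cudaFree(highBuf));
    CHECK(cudaStreamDestroy(lowStream));
    CHECK(cudaStreamDestroy(highStream));
    return 0;
}

int main()
{
    return runPriorityDemo();
}
```

Something like `nvcc -o priority_demo priority_demo.cu` should build it. To see whether the priorities have any effect on a given card, you would need to look at the kernel timeline in a profiler.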

CUDA dynamic parallelism may also use preemption “behind the scenes”, but there is no explicit control over that.

Does the P100 (not GTX 1080) provide any other user control over preemption? It definitely has some new control over CUDA kernels since the P100 whitepaper talks about preemption solving the problem of stalled applications waiting for long-running kernels to finish or time out. Perhaps the driver uses its own heuristics for such scheduling by time-slicing kernels that run more than, say, fifty milliseconds.