Without some magic, halting (via a breakpoint) a GPU that drives the Windows GUI through the WDDM driver will effectively halt the system. The GUI cannot make progress while the GPU is stopped, and since the GUI is the user's control surface for the system, the machine becomes unresponsive to input indefinitely, even though the OS itself continues to run.
I do not know how that software preemption feature works internally. It may be exactly the kind of magic that makes it possible to use the debugger with a single WDDM-controlled GPU these days, even when that GPU is driving the Windows desktop. Presumably the documentation gives guidance on the configuration settings required for this case.
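For what it's worth, the cuda-gdb documentation describes an environment variable for enabling software preemption on Linux; I am not certain what the exact equivalent is on Windows (Nsight exposes a similar single-GPU debugging option in its settings), so treat this as a sketch to verify against the docs for your toolkit version:

```shell
# Enable software preemption for the debugger (documented for cuda-gdb on
# Linux; the Windows/Nsight path may differ -- check your toolkit's docs).
export CUDA_DEBUGGER_SOFTWARE_PREEMPTION=1

# "my_app" is a placeholder for your executable.
cuda-gdb ./my_app
```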
But since you already have a dual-GPU system, you would probably want to check how to configure the driver so that GPU 1 drives the display, leaving GPU 0 available for CUDA applications; that way you can use all your programs as-is.
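Once the display is attached to the other GPU, a common way to steer CUDA work onto a specific device is the `CUDA_VISIBLE_DEVICES` environment variable. Note that CUDA's device enumeration does not necessarily match the driver's or `nvidia-smi`'s numbering, so verify which physical GPU gets which index on your machine:

```shell
# Make only device 0 (in CUDA's enumeration order) visible to CUDA apps,
# assuming the other GPU is the one driving the display.
export CUDA_VISIBLE_DEVICES=0

# "my_cuda_app" is a placeholder for your executable.
./my_cuda_app
```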
To diagnose the issues with a long-running kernel, I would suggest running under cuda-memcheck before diving in with the debugger. cuda-memcheck can detect many problems, including out-of-bounds accesses, race conditions, and incorrect API arguments.
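The invocation is straightforward (in newer toolkits cuda-memcheck has been superseded by compute-sanitizer, but the usage pattern is the same); `my_app` is a placeholder for your executable:

```shell
# Default mode checks for out-of-bounds accesses, misaligned accesses,
# and CUDA API errors.
cuda-memcheck ./my_app

# Shared-memory race detection is a separate tool mode.
cuda-memcheck --tool racecheck ./my_app
```

If cuda-memcheck comes back clean, you can then move on to the debugger with more confidence that the problem is logic rather than memory corruption.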