I have an application that launches a kernel on GPU1. While the kernel runs, the CPU can do other work until the kernel finishes.
When I add multiple GPUs using the same model, the CPU freezes until all the kernels complete (cudaThreadSynchronize is called in each thread that manages a GPU). I don't know why the host cannot continue while the assigned GPUs are off doing work. The whole system literally freezes (even the mouse pointer is locked).
It seems to be related to using multiple CUDA GPUs and the host CPU at the same time.
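
For reference, the setup described above can be sketched roughly as follows. This is only a minimal illustration, not the actual application: the kernel `busyKernel`, the helper `gpuWorker`, the buffer size, and the iteration counts are all hypothetical placeholders, and the use of `std::thread` for the per-GPU host threads is an assumption. It calls `cudaDeviceSynchronize()`, which replaced the now-deprecated `cudaThreadSynchronize()` in later CUDA releases.

```cuda
#include <cstdio>
#include <thread>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical stand-in for the real kernel: keeps each GPU busy for a while.
__global__ void busyKernel(float *data, int n, int iters)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k)
            v = v * 1.0001f + 0.5f;
        data[i] = v;
    }
}

// One host thread per GPU: bind to the device, launch, then block until done.
void gpuWorker(int device)
{
    const int n = 1 << 20;  // placeholder problem size
    float *d_data = nullptr;

    cudaSetDevice(device);
    cudaMalloc(&d_data, n * sizeof(float));
    cudaMemset(d_data, 0, n * sizeof(float));

    // The launch itself is asynchronous and returns immediately.
    busyKernel<<<(n + 255) / 256, 256>>>(d_data, n, 10000);

    // Blocks this host thread until the device has finished.
    // (Current name for the deprecated cudaThreadSynchronize().)
    cudaError_t err = cudaDeviceSynchronize();
    printf("GPU %d finished: %s\n", device, cudaGetErrorString(err));

    cudaFree(d_data);
}

int main()
{
    int deviceCount = 0;
    if (cudaGetDeviceCount(&deviceCount) != cudaSuccess || deviceCount == 0) {
        printf("No CUDA devices found\n");
        return 1;
    }

    // Start one managing thread per GPU.
    std::vector<std::thread> workers;
    for (int d = 0; d < deviceCount; ++d)
        workers.emplace_back(gpuWorker, d);

    // CPU-side work that is meant to overlap with the kernels.
    double acc = 0.0;
    for (long i = 1; i <= 100000000L; ++i)
        acc += 1.0 / static_cast<double>(i);
    printf("Host work done: %f\n", acc);

    for (auto &t : workers)
        t.join();
    return 0;
}
```

With a single GPU the host loop in `main` runs alongside the kernel as expected; the freeze described above shows up once more than one managing thread is sitting in the synchronize call.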