I wonder if there are any ways to stop CUDA from consuming 100% of the GPU during kernel execution,
just to give a little bit to the OS so that other applications and the general UI still behave/respond normally.
I know GPUs don't have OS kernels on them with fine-grained multitasking, but have people found solutions to work around this?
Maybe by configuring the app to, say, leave one multiprocessor free for the OS?
Can't be done. All you can do is either reduce the per-kernel execution time (reduce the work done per call or make the code faster), or use a dedicated compute GPU.
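To make the "reduce the work done per call" option concrete, here is a minimal sketch (the kernel `process_chunk`, the data sizes, and the chunk size are hypothetical): instead of one long launch over the whole data set, the host loops over smaller chunks and synchronizes after each launch, so the GPU goes idle between launches and the display driver gets a window to service the GUI.

```cpp
// Minimal sketch: split one long-running job into many short kernel launches.
// process_chunk, N, and CHUNK are placeholders for your real workload.
#include <cuda_runtime.h>

__global__ void process_chunk(float *data, int offset, int count)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < count)
        data[offset + i] *= 2.0f;   // stand-in for the real per-element work
}

int main()
{
    const int N = 1 << 24;          // total elements
    const int CHUNK = 1 << 18;      // small enough that each launch finishes quickly

    float *d_data;
    cudaMalloc(&d_data, N * sizeof(float));
    cudaMemset(d_data, 0, N * sizeof(float));

    const int threads = 256;
    for (int offset = 0; offset < N; offset += CHUNK) {
        int count = (N - offset < CHUNK) ? (N - offset) : CHUNK;
        int blocks = (count + threads - 1) / threads;
        process_chunk<<<blocks, threads>>>(d_data, offset, count);
        // Wait for this chunk to finish; between launches the GPU is free,
        // so the OS and GUI can update the display.
        cudaDeviceSynchronize();
    }

    cudaFree(d_data);
    return 0;
}
```

Picking the chunk size is a trade-off: small enough that each launch stays well under the display driver's watchdog timeout and the UI stays responsive, but large enough that launch overhead doesn't dominate the total runtime.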
I do not believe this can be solved in software on current cards. The design of the GPU appears to require all multiprocessors to be used for the same task.
Fermi hardware is supposed to lift this restriction, but NVIDIA has not said anything about dividing multiprocessors between the GUI and computation. They have only talked about dividing multiprocessors between different CUDA kernels in the same application.