How to limit the number of SMs used by my program?

How can I limit the number of SMs used while I am computing? I want to solve a large matrix using only part of the GPU's resources, because I can't do anything else while the computation is running (the program occupies all of the resources, 100%).
I want to keep some GPU resources for the other applications on my computer.

CUDA generally doesn’t have this capability exposed in the programming model. An ordinary CUDA program may fill the available resources on the GPU, and there are no limits provided for this.

The exception is CUDA MPS on a Volta or later GPU, which you can read about in the CUDA MPS documentation (just google CUDA MPS). However, this doesn’t provide a general-purpose restriction; for example, it cannot partition the GPU between CUDA and graphics applications.
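
As a rough sketch of the MPS route: the MPS documentation describes an environment variable, CUDA_MPS_ACTIVE_THREAD_PERCENTAGE, that a client process can set to cap the share of the GPU's threads it may use. The example below assumes Linux, a Volta or later GPU, and an MPS control daemon that is already running; the helper name request_mps_thread_percentage and the value 50 are made up for illustration. Without MPS running, the variable should simply have no effect. Printing the SM count is only a way to inspect what the process sees; check the MPS documentation for how a limit is actually reported.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: asks MPS to cap this process at `percent` of the
// GPU's threads. It only sets an environment variable, so it must run
// before the first CUDA call initializes the runtime in this process.
bool request_mps_thread_percentage(int percent) {
    if (percent < 1 || percent > 100) return false;
    char value[8];
    snprintf(value, sizeof(value), "%d", percent);
    return setenv("CUDA_MPS_ACTIVE_THREAD_PERCENTAGE", value, 1) == 0;
}

int main() {
    // Assumption: an MPS control daemon is already running for this GPU.
    if (!request_mps_thread_percentage(50)) {
        fprintf(stderr, "could not set the MPS thread percentage\n");
        return 1;
    }

    // First CUDA call in the process; the runtime is initialized here.
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, 0);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("Device 0 (%s) reports %d SMs to this process\n",
           prop.name, prop.multiProcessorCount);
    return 0;
}
```

The same variable can also be exported in the shell before launching an unmodified program, which avoids touching the source at all.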

Another GPU sharing option is NVIDIA vGPU (google that), but this doesn’t partition a GPU for a single user.

Thank you. And how can I solve the problem of getting little stutters on my computer (such as when using a browser) while I am computing?

Use 2 GPUs, one dedicated for computing, one dedicated for graphics.

Or if you have control over the compute apps, make sure the compute apps you run only have kernel runtimes that are very short, say 100ms or less.
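
One way to keep kernel runtimes short is to split a large job into slices and launch one small kernel per slice, instead of a single long-running kernel. The sketch below is only an illustration of that idea: scale_kernel, scale_in_chunks, and the sizes are hypothetical names and values, and a real matrix solver would need its own decomposition. The right chunk size has to be found by timing on your own GPU so that each launch stays well under the 100ms mentioned above.

```cuda
#include <algorithm>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical example kernel: multiplies `count` elements in place.
__global__ void scale_kernel(float* data, size_t count, float factor) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < count) data[i] *= factor;
}

// Processes `n` elements in slices of at most `chunk` elements, with one
// short kernel launch per slice. Returns the first CUDA error encountered.
cudaError_t scale_in_chunks(float* d_data, size_t n, float factor, size_t chunk) {
    if (chunk == 0) return cudaErrorInvalidValue;
    const int threads = 256;
    for (size_t offset = 0; offset < n; offset += chunk) {
        size_t count = std::min(chunk, n - offset);
        unsigned blocks = (unsigned)((count + threads - 1) / threads);
        scale_kernel<<<blocks, threads>>>(d_data + offset, count, factor);
        cudaError_t err = cudaGetLastError();
        if (err != cudaSuccess) return err;
        // Keep only one short kernel in flight at a time.
        err = cudaDeviceSynchronize();
        if (err != cudaSuccess) return err;
    }
    return cudaSuccess;
}

int main() {
    const size_t n = 1 << 24;          // total elements (illustrative)
    const size_t chunk = 1 << 20;      // elements per launch (tune by timing)
    std::vector<float> h(n, 1.0f);

    float* d = nullptr;
    if (cudaMalloc(&d, n * sizeof(float)) != cudaSuccess) return 1;
    cudaMemcpy(d, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    cudaError_t err = scale_in_chunks(d, n, 2.0f, chunk);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        cudaFree(d);
        return 1;
    }

    cudaMemcpy(h.data(), d, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("first = %f, last = %f\n", h[0], h[n - 1]);
    cudaFree(d);
    return 0;
}
```

Whether this removes the stutter depends on how the driver schedules graphics work between launches, so treat it as something to try rather than a guaranteed fix.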

thank you very much!