I have four GPUs (RTX A6000), and I want to enable time-slicing on two of them but not on the other two. I checked the GPU time-slicing documentation but couldn’t find anything about doing this per GPU.
When I ran the nvidia-smi -i compute-policy --set-timeslice -l command in a terminal, all four GPUs reported “default”. I think this means that when I use GPU time-slicing on Kubernetes, it is applied to all GPUs, which is not what I want.
That almost certainly cannot be done, but you can put the GPU in exclusive process mode.
GPU time-slicing refers to the ability of a GPU to support multiple host processes. The nvidia-smi options for this are default, short, medium, and long. On/Off is not an option.
But you can prevent a GPU from supporting multiple host processes by setting its compute mode to exclusive process.
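As a rough sketch of what that could look like on a four-GPU machine: the script below sets exclusive process mode on two GPUs and leaves the other two in the default mode. The GPU indices 2 and 3 are only an assumption for illustration (check yours with nvidia-smi -L), and the run and set_exclusive helpers and the DRY_RUN variable are made up for this example so the commands can be previewed before anything is changed.

```shell
#!/usr/bin/env bash
# Sketch: put selected GPUs into exclusive process compute mode.
# The GPU indices used below are assumptions; list yours with `nvidia-smi -L`.

# Hypothetical helper: print the command when DRY_RUN=1 (the default here),
# otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: set EXCLUSIVE_PROCESS on every GPU index given.
set_exclusive() {
  for gpu in "$@"; do
    run nvidia-smi -i "$gpu" -c EXCLUSIVE_PROCESS
  done
}

# Assumed layout: GPUs 2 and 3 become single-process, GPUs 0 and 1 stay shared.
set_exclusive 2 3

# Show the compute mode of each GPU to confirm the change.
run nvidia-smi --query-gpu=index,name,compute_mode --format=csv
```

By default this only prints the commands. Running it with DRY_RUN=0 as root applies them. The compute mode is reset when the machine reboots, so it would need to be reapplied at startup. Once a GPU is in exclusive process mode, a second process that tries to use it fails to create a context instead of sharing the GPU.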
I want to build a development environment that offers both rich and poor GPU resources. If I apply time-slicing to all GPUs, the rich GPU resources will become unusable because of errors such as out-of-memory when multiple users access the development environment at the same time.