Running multiple users on 1 GPU

Hi,

We are running a SLURM environment and often see that GPUs are in use but not at 100%, for example when running Jupyter notebooks. In the past, MPS supported multiple users on one GPU. Nowadays this seems to work for only one user per GPU. Without MPS it is possible to run programs from multiple users on the same GPU, but only as long as not all of the memory is used. Limiting memory then seems possible only by the users themselves, inside their own programs, but in a multi-user setup you want an ‘overall’ memory limit so that you, as an admin, are in control. Any thoughts on this from anyone?

NVIDIA introduced MIG (Multi-Instance GPU) with the Ampere A100 and A30:

https://www.nvidia.com/en-gb/technologies/multi-instance-gpu/

https://docs.nvidia.com/datacenter/tesla/pdf/NVIDIA_MIG_User_Guide.pdf

Well, that is nice if you are at the stage of buying cards, but we already have NVIDIA cards. I know that in the early days the MPS software allowed multiple users. From a commercial standpoint I can imagine that moving to A100s is the way to go, but for our environment it is not.
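
For existing cards without MIG, one possible admin-side stopgap is to watch per-process GPU memory and flag jobs that exceed a budget. The sketch below is only an illustration and it reports, it does not enforce anything. It assumes `nvidia-smi` is on the PATH and supports `--query-compute-apps=pid,used_memory`; the 4096 MiB budget and the function names are hypothetical.

```python
import subprocess

# Hypothetical per-process budget in MiB; choose a value that fits your cards.
BUDGET_MIB = 4096


def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' rows (csv,noheader,nounits) into (pid, MiB) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Skip blank or unexpected rows, e.g. "[N/A]" values.
        if len(parts) != 2 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        rows.append((int(parts[0]), int(parts[1])))
    return rows


def over_budget(rows, budget_mib=BUDGET_MIB):
    """Return the (pid, MiB) rows whose memory use exceeds the budget."""
    return [(pid, mib) for pid, mib in rows if mib > budget_mib]


def query_gpu_processes():
    """Ask nvidia-smi for the compute processes on all GPUs of this node."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)
```

On a GPU node, `over_budget(query_gpu_processes())` returns the offending PIDs, which an admin script could then map to users or jobs. What to do with them (warn, cancel the job) is a policy decision left out of this sketch.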