What is the good way to use MIG on a slurm cluster?

mikkelsen.kaare · March 18, 2021, 7:57pm

I apologize if this is the wrong subforum, it seemed to be one of the most likely at least…

Our HPC cluster (running Slurm) was recently upgraded with a number of A100 cards, which we are now trying to get the most out of. That includes figuring out how to activate the Multi-Instance GPU (MIG) functionality. But, reading through NVIDIA Multi-Instance GPU User Guide :: NVIDIA Tesla Documentation, it seems to assume that users have sudo rights?
If the admin has enabled MIG on each GPU, is it then possible for the users in their jobscripts to ‘activate’ 7 MIG 1g.5gb profiles, and then assign CUDA jobs to each profile?
Right now, the closest we can get is to first run a job with ‘nvidia-smi -L’ on the node to get the device IDs (they look like ‘MIG-GPU-09156ffa-eece-6481-ce94-42ac07f27aa4/7/0’), and then run the ‘real’ jobscript with lines like

CUDA_VISIBLE_DEVICES=MIG-GPU-09156ffa-eece-6481-ce94-42ac07f27aa4/7/0 “CUDA job” &

but this seems like a very cumbersome workflow?
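
One way to collapse the two steps into a single jobscript might be to parse the `nvidia-smi -L` output at the start of the job. The following is only a sketch: the helper name `list_mig_ids` and the `./cuda_job` placeholder are made up for illustration, and it assumes every MIG line of the output contains the ID in the form `(UUID: MIG-...)`, as in the example ID above.

```shell
#!/bin/bash
# Hypothetical helper: read `nvidia-smi -L` output on stdin and print
# one MIG device ID per line. Assumes each MIG line contains
# "(UUID: MIG-...)"; lines for the parent GPU (UUID: GPU-...) are skipped.
list_mig_ids() {
    sed -n 's/.*UUID: \(MIG-[^)]*\)).*/\1/p'
}

# Usage inside a jobscript (./cuda_job is a placeholder for the real program):
#   first_id=$(nvidia-smi -L | list_mig_ids | head -n 1)
#   CUDA_VISIBLE_DEVICES="$first_id" ./cuda_job &
#   wait
```

This still relies on the admin having created the MIG instances beforehand; the script only discovers the IDs, it does not create or destroy instances.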

teejcee · April 13, 2021, 8:58am

I would like to know the same. We are about to purchase a few servers with A100 cards and enable MIG licensing, as most of the expected workloads would not be able to utilize the full potential of the A100. We use Slurm and would like to expose each MIG instance as a GRES. @mikkelsen.kaare, have you been able to find a solution to your problem, and would you be willing to share your findings?

mikkelsen.kaare · April 16, 2021, 8:11am

Hello @teejcee

The best we have come up with so far is a setup where the user switches between ‘mig’ and ‘non mig’ use. The sysadmin has defined ‘mig’ to be the maximally parallel setup, with 7 devices. If a job is started with gres=gpu:mig, a call is made to nvidia-smi at the start of the jobscript to get the device IDs, and then the contents of a job array (defined in the jobscript) are spread across the 7 devices. This still means that the GPU is only serving a single user job, but it does make it possible to switch dynamically between the max and min number of devices.
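
To illustrate the spreading step described above, here is a rough sketch of a round-robin assignment of tasks to MIG devices. It is not our actual jobscript: the function name `assign_round_robin` and the `./cuda_job` placeholder are hypothetical, and it assumes the device IDs arrive on stdin, one per line.

```shell
#!/bin/bash
# Hypothetical helper: print "task_index device_id" pairs, handing out
# devices round-robin. $1 = number of tasks; device IDs come from stdin.
assign_round_robin() {
    awk -v n="$1" '
        NF { ids[count++] = $1 }       # collect non-empty device IDs
        END {
            if (count == 0) exit 1     # fail if no device was listed
            for (t = 0; t < n; t++) print t, ids[t % count]
        }'
}

# Usage inside a jobscript (./cuda_job is a placeholder for the real program):
#   nvidia-smi -L | sed -n 's/.*UUID: \(MIG-[^)]*\)).*/\1/p' \
#     | assign_round_robin 14 \
#     | { while read -r task dev; do
#           CUDA_VISIBLE_DEVICES="$dev" ./cuda_job "$task" &
#         done
#         wait; }
```

Note that this launches all tasks at once, so with 14 tasks and 7 devices each device would run two tasks concurrently, and nothing here checks whether they fit in the device memory.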

Also:
If your users are highly disciplined, Slurm can be set to allow multiple jobs to run on the same node. If you use the ‘mig’ setup from above, and somehow coordinate which of the MIG instances each user assigns tasks to, it is possible to have multiple users use different MIG devices simultaneously. However, nothing checks whether the combined tasks exceed the memory of any given device, and this seems to really just be a worse version of what Slurm is supposed to do for us.

But OpenPBS seems to already be cooking some MIG compatibility into their system, so we’re hoping that Slurm will be inspired to do the same :)
(https://openpbs.atlassian.net/wiki/spaces/PD/pages/2313453569/Nvidia+MIG+Support)
