How to use GPU Operator with MIG to configure 2 GPUs on one node separately

zrams.pdk · September 10, 2024, 2:03am

I’m running the GPU Operator on OpenShift 4.14 and I have a GPU node with two A100 GPUs. My MIG strategy is set to “single” and the node currently carries the label

nvidia.com/mig.config: all-1g.5gb

My goal is to split one GPU into seven 1g.5gb partitions and to leave the other GPU as a single large 7g.40gb partition.
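
For reference, the target layout can be written as a per-GPU entry in the mig-parted config format that the MIG manager reads. This is only a sketch under assumptions: the ConfigMap name custom-mig-config, the profile name custom-mixed, and the nvidia-gpu-operator namespace are hypothetical and not taken from this cluster, and the GPU indices 0 and 1 are assumed to match the two A100s.

```yaml
# Sketch only: "custom-mig-config" and "custom-mixed" are hypothetical names.
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-mig-config
  namespace: nvidia-gpu-operator   # assumed operator namespace
data:
  config.yaml: |
    version: v1
    mig-configs:
      custom-mixed:
        # GPU 0: seven 1g.5gb instances
        - devices: [0]
          mig-enabled: true
          mig-devices:
            "1g.5gb": 7
        # GPU 1: one 7g.40gb instance
        - devices: [1]
          mig-enabled: true
          mig-devices:
            "7g.40gb": 1
```

If this approach applies, the ClusterPolicy would point at the ConfigMap through migManager.config.name, and the node label would become nvidia.com/mig.config: custom-mixed. Because the two GPUs end up with different profiles, the strategy would probably also need to be “mixed” rather than “single”, since “single” expects every GPU on the node to expose the same MIG device type.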

I tried setting the following labels:

nvidia.com/mig.config: all-enabled
nvidia.com/mig-7g.40gb.count: '1'
nvidia.com/mig-1g.5gb.count: '7'
Even though the MIG manager accepts this config and reports it as "successful", it doesn't partition the GPUs correctly.
The labels show on the node, but its status is:
status:
  capacity:
    cpu: '256'
    ephemeral-storage: 468250412Ki
    hugepages-1Gi: '0'
    hugepages-2Mi: '0'
    memory: 527902464Ki
    nvidia.com/gpu: '14'
    nvidia.com/mig-1g.5gb: '0'
    pods: '250'
  allocatable:
    cpu: 255500m
    ephemeral-storage: '430465837161'
    hugepages-1Gi: '0'
    hugepages-2Mi: '0'
    memory: 526751488Ki
    nvidia.com/gpu: '14'
    nvidia.com/mig-1g.5gb: '0'
    pods: '250'

I’d appreciate any guidance I could get with this issue.
Thanks!
