Hello,
I am currently exploring how to efficiently use MIG in a Kubernetes environment, and I have some questions about MIG instance reconfiguration.
According to the following documentation from NVIDIA:
NVIDIA MIG Manager For Kubernetes | NVIDIA NGC,
it appears that when using the MIG Manager, all workloads on a GPU must be stopped before the MIG configuration can be changed.
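For context, my understanding is that MIG Manager works declaratively: you give it a target layout for the whole GPU and it reapplies it, which is presumably why all workloads must be drained first. A sketch of what I believe such a config looks like (based on the mig-parted config format; the config name `mixed-profile` and device index are my own placeholders):

```yaml
# Assumed mig-parted / MIG Manager config shape (not copied from any
# official file): it describes the desired end state of GPU 0, not the
# individual create/delete steps needed to reach it.
version: v1
mig-configs:
  mixed-profile:
    - devices: [0]
      mig-enabled: true
      mig-devices:
        "4g.20gb": 1
        "2g.10gb": 1
```

If applying this really means tearing down and rebuilding the whole layout, that would explain the requirement to stop all workloads on the GPU first.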
I’m wondering how this differs from an alternative method where, instead of using MIG Manager, we manually delete idle MIG instances and create new ones. For example, suppose I have a configuration with 4g.20gb + 2g.10gb instances, and the 4g.20gb instance is actively running a workload. If I want to reconfigure to 4g.20gb + 1g.5gb + 1g.5gb, it seems that using MIG Manager would require terminating the workload on the 4g.20gb instance.
However, if I were to simply delete the idle 2g.10gb instance and manually create two 1g.5gb instances instead, would this cause any issues? Does this approach avoid the need to stop the running job?
Also, I’ve noticed that the Dynamic MIG feature mentioned in the NVIDIA Run:ai documentation (Version 2.19 -) has been deprecated. Since it seemed like a useful capability, I’d also like to understand why it was deprecated.
Thank you!