How to use CUDA_VISIBLE_DEVICES for MIG instances

ryy19 · November 14, 2021, 6:28pm

Hi, there!
I am new to Multi-Instance GPU (MIG). I want to use MIG, the new feature of the A100, to optimize my application. It uses MPI, so it includes code like cudaSetDevice(rank%8). After splitting each of the original GPUs into two MIG instances, I want to change my code as little as possible, so I changed the call above to cudaSetDevice(rank%16) and set CUDA_VISIBLE_DEVICES={UUID of each MIG}. However, only the first MIG instance is found.
How can CUDA_VISIBLE_DEVICES be applied to MIG instances? If it cannot, what is the alternative way to use MIG with MPI?

Robert_Crovella · November 14, 2021, 9:41pm

ryy19 · November 15, 2021, 3:44am

Thanks for your reply. I know the instructions in the manual. I am just wondering why, with an environment setup like CUDA_VISIBLE_DEVICES=MIG-aa,MIG-bb, cudaSetDevice(1) fails and cudaGetDeviceCount() returns 1.
What can I do in this situation?
My CUDA version is 11.4.

Robert_Crovella · November 15, 2021, 5:02am

From the previously linked doc section:

CUDA can only enumerate a single compute instance

You may also wish to familiarize yourself with the terminology and partitioning sections of that document.

If you wish, you can create a multi-process application (perhaps for example using MPI) and assign one compute instance or GPU instance to each MPI rank, using a setting for CUDA_VISIBLE_DEVICES such that each MPI rank “sees” a different compute instance or GPU instance. In this way, each MPI rank will indeed see only a single CUDA enumerated device, and indeed each MPI rank will observe that

That is the idea. It may very well require changes to your application.
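
A minimal host-side sketch of that idea, under stated assumptions: the helper names mig_uuid_for_rank and expose_single_mig are hypothetical, the UUID strings are the placeholders from this thread (MIG-aa, MIG-bb), and the node-local rank is assumed to be known before any CUDA call is made (for example, Open MPI exports it as OMPI_COMM_WORLD_LOCAL_RANK; other launchers use different variables). It relies on POSIX setenv and is not a definitive implementation.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: choose one MIG UUID for a node-local rank,
// wrapping around if there are more ranks than MIG instances.
std::string mig_uuid_for_rank(const std::vector<std::string>& mig_uuids,
                              int local_rank) {
    if (mig_uuids.empty()) {
        throw std::invalid_argument("no MIG UUIDs supplied");
    }
    if (local_rank < 0) {
        throw std::invalid_argument("local rank must be non-negative");
    }
    const std::size_t index =
        static_cast<std::size_t>(local_rank) % mig_uuids.size();
    return mig_uuids[index];
}

// Hypothetical helper: make exactly one MIG instance visible to this process.
// Assumption: this runs before the first CUDA runtime call in the process,
// since the variable is read when CUDA initializes.
void expose_single_mig(const std::vector<std::string>& mig_uuids,
                       int local_rank) {
    const std::string uuid = mig_uuid_for_rank(mig_uuids, local_rank);
    // POSIX setenv; the final argument 1 overwrites any existing value.
    if (setenv("CUDA_VISIBLE_DEVICES", uuid.c_str(), 1) != 0) {
        throw std::runtime_error("failed to set CUDA_VISIBLE_DEVICES");
    }
}
```

With this arrangement each rank would call expose_single_mig(...) at startup and then use cudaSetDevice(0) instead of cudaSetDevice(rank%16), because every process enumerates only its own instance as device 0. A launcher wrapper script that exports the variable per rank before starting the program would achieve the same effect without touching the application's startup code.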

ryy19 · November 15, 2021, 6:35am

Many thanks, I’ve got your point!

system · November 29, 2021, 6:35am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
