I am using a system with two A4000 GPUs; however, when I run the Modulus examples, only one of them is used at a time. Running the line:
torch.cuda.device_count()
in Python inside the Docker container prints "2", indicating (as far as I understand) that both GPUs are visible to PyTorch, but only GPU 0 is actually used while the examples run.
My question is: what additional steps do I need to take so that Modulus can use both GPUs on the system? The command I am using to start the container is:
docker run --gpus all --runtime nvidia --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -it -p 8888:8888 modulus:22.09
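From what I can tell from the Modulus documentation, multi-GPU runs are supposed to be launched with `mpirun`, one process per GPU, along these lines (the script name `train.py` here is just a placeholder for one of the example scripts):

```shell
# Inside the running container: launch one training process per GPU.
# --allow-run-as-root is needed because the container runs as root.
mpirun --allow-run-as-root -np 2 python train.py
```

Is something like this the intended way to use both GPUs, or are there additional container or environment settings I am missing?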