How to use my four-GPU node

Hi azzulrd,

I have been reading about multi-GPU programming: MPI+OpenACC, or selecting the device with OpenACC.

MPI+OpenACC is the recommended (and easiest) method for multi-GPU programming. There are several online tutorials and talks if you need guidance on how to do this.

For example: http://on-demand.gputechconf.com/gtc/2017/presentation/S7546-jeff-larkin-multi-gpu-programming-with-openacc.pdf

Class #2: https://developer.nvidia.com/openacc-advanced-course

Alternatively, you can use OpenMP to create multiple CPU threads and then assign devices to each thread.

I have a question: how can I select a single GPU device if I set the device only with OpenACC, without MPI? (I have 8 devices available in the whole cluster.)

You would call the OpenACC API’s “acc_set_device_num” routine.

Note you would want to use “acc_set_device_num” with MPI as well, so that different ranks use different devices.

Alternatively, you can use the environment variables “ACC_DEVICE_NUM” or “CUDA_VISIBLE_DEVICES” to set the device each rank uses.

Is there any specific configuration for the cluster in order to get all 8 devices available, instead of 2 per node?

Each cluster management tool has its own way of partitioning nodes, so you’ll need to ask your system admin.

How should I exploit the 4-GPU nodes?

I highly recommend you use MPI+OpenACC. It’s the easiest way to program multiple GPUs, and it allows your code to scale across multiple nodes.

-Mat