About multi-GPU control

parody9c · December 20, 2019, 7:09pm

Hello !

I am currently learning how to control multiple GPUs with CUDA.

But I have no idea how to control them.

I looked at this sample (https://github.com/zchee/cuda-sample/blob/master/0_Simple/simpleMultiGPU/simpleMultiGPU.cu), but that code uses a for loop.

That code inspired me, though, so instead I use CUDA + MPI to set the device ID from the rank and run on all GPUs at the same time.

But my code uses only one GPU… :(

How can I control the GPUs in parallel?

Here is my code

#include <stdio.h>
#include <curand.h>
#include <mpi.h>

#define XBLOCKSIZE 10
#define XGRIDSIZE 10

int main(int argc, char** argv)  {
        int mpi_error, rank, numtasks;
        mpi_error = MPI_Init(&argc, &argv);

        if(mpi_error != MPI_SUCCESS) {
                printf("MPI error has occured.\n");
                return 0;
        }

        MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        //rank 0 print problem size
        if(rank == 0)
                printf("<<<%d,%d>>>\n", XGRIDSIZE, XBLOCKSIZE);

        int num_of_gpus;

        cudaGetDeviceCount(&num_of_gpus);

        printf("number of gpus : %d\n", num_of_gpus);
        printf("My rank : %d\n", rank);

        if(numtasks > num_of_gpus){
                printf("Too many process than number of gpus\n");
                return 1;
        }

        int devID = rank;

        cudaError_t error;
        cudaDeviceProp deviceProp;
        error = cudaGetDevice(&devID);

        if(error != cudaSuccess){
                printf("cudaGetDevice returned error %s\n", cudaGetErrorString(error));
        }

        error = cudaGetDeviceProperties(&deviceProp, devID);

        if(error != cudaSuccess){
                printf("cudaGetDeviceProperties returned error %s\n", cudaGetErrorString(error));
        }
        else{
                printf("GPU BUS ID : %d\n", deviceProp.pciBusID);
                printf("GPU Device ID : %d\n", deviceProp.pciDeviceID);
        }

        MPI_Finalize();
        return 0;
}

Results :
<<<10,10>>>
number of gpus : 2
My rank : 0
number of gpus : 2
My rank : 1
GPU BUS ID : 24
GPU Device ID : 0
GPU BUS ID : 24
GPU Device ID : 0

mnicely · December 21, 2019, 2:36pm

I think you’re making it more complicated than it needs to be. You can easily use OpenMP.

Set the number of OpenMP threads to the number of GPUs you have or want to use.
Create an OpenMP parallel region.
Inside the region, get the OpenMP thread ID.
In each OpenMP thread you must set the device. This is very important because it makes the thread aware of the correct CUDA context.
You can use a struct to pass data to the different threads.

omp_set_num_threads( numDevices );
#pragma omp parallel
  {
    int ompId = omp_get_thread_num( );

    // We must set the device in each thread
    // so the correct CUDA context is visible
    checkCudaErrors( cudaSetDevice( ompId ) );
    checkCudaErrors( cudaStreamCreate( &streams[ompId] ) );

    ...

    // hostData is a placeholder name: the original used `struct`, a reserved keyword
    int offset { sizeData * ompId };
    void *args[] { &offset, &hostData[ompId].data };

    checkCudaErrors(
        cudaLaunchKernel( reinterpret_cast<void *>( &doMath ), blocksPerGrid, threadPerBlock, args, 0, streams[ompId] ) );

  }

Here’s an additional link on OpenMP + CUDA https://devblogs.nvidia.com/cuda-pro-tip-always-set-current-device-avoid-multithreading-bugs/
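
The offset computation in the snippet above (sizeData * ompId) assumes the data divides evenly across the GPUs. As a rough host-side sketch of how an uneven split could be handled, assuming a flat array of `total` elements: the names `Chunk` and `chunkForDevice` are hypothetical and not part of CUDA or OpenMP.

```cpp
#include <cstddef>

// Hypothetical helper type: the slice of the input owned by one device.
struct Chunk {
    std::size_t offset;  // index of the first element for this device
    std::size_t count;   // number of elements for this device
};

// Split `total` elements across `numDevices` devices.
// The first (total % numDevices) devices each take one extra element.
// Assumes numDevices > 0 and 0 <= id < numDevices.
Chunk chunkForDevice(std::size_t total, int numDevices, int id) {
    const std::size_t n = static_cast<std::size_t>(numDevices);
    const std::size_t i = static_cast<std::size_t>(id);
    const std::size_t base = total / n;
    const std::size_t extra = total % n;
    const std::size_t count = base + (i < extra ? 1 : 0);
    const std::size_t offset = i * base + (i < extra ? i : extra);
    return Chunk{offset, count};
}
```

Inside the parallel region, each thread could call chunkForDevice(total, numDevices, ompId) and pass the resulting offset and count to its kernel launch instead of a fixed sizeData.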

Robert_Crovella · December 21, 2019, 4:08pm

Your MPI code is broken because you never set the device to the rank.

All threads/processes start out with an assumed device ID of zero. In order to change this, it is mandatory to make a call to cudaSetDevice().

Your code never does that.

You can fix it by making this change:

cudaDeviceProp deviceProp;
error = cudaSetDevice(devID);  // add this line
error = cudaGetDevice(&devID);
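
The original code exits when there are more ranks than GPUs. If sharing a GPU between ranks is acceptable, one possible variation is to wrap the rank around the device count before passing it to cudaSetDevice(). This is only a sketch under that assumption: `deviceForRank` is a hypothetical helper, not a CUDA or MPI function, and on a multi-node job the node-local rank would be needed rather than the global one.

```cpp
#include <cassert>

// Hypothetical helper: pick a device index for a given rank.
// Ranks beyond the GPU count wrap around (round-robin), so several
// ranks may end up sharing one device.
// Returns -1 when no device is available.
int deviceForRank(int rank, int numGpus) {
    assert(rank >= 0);
    if (numGpus <= 0) {
        return -1;  // nothing to select
    }
    return rank % numGpus;
}
```

The result would replace `int devID = rank;`, for example `int devID = deviceForRank(rank, num_of_gpus);`, followed by the cudaSetDevice(devID) call shown above.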

parody9c · December 23, 2019, 3:56pm

Thank you for the quick reply!!

I solved it :)
