Hi, I’m pretty new to CUDA programming and I’m having trouble porting part of the Geant4 code to the GPU.
Geant4 is a particle simulation toolkit written in C++.
The main reasons we think this is difficult are as follows:
- Geant4 is written in C++ rather than C.
- Geant4’s program structure is a multi-level class structure (in other words, classes call other classes’ methods to complete the work).
- For a kernel function to use a class’s member functions, we need to add the __device__ __host__ qualifiers before each function. Could anyone suggest a smart way to add these qualifiers in front of every class’s member function?
- Each Geant4 class has many class pointer member variables. Hence, we need to use the CUDA Unified Memory mechanism to handle them (this is a rather difficult part).
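For the qualifier problem, one idea we are considering is a single macro instead of writing the qualifiers by hand everywhere. This is only a rough sketch: G4_HD is a name we made up (it is not part of Geant4 or CUDA), and the Rectangle here is a simplified stand-in with plain int members. Under nvcc the macro expands to the CUDA qualifiers; under an ordinary C++ compiler it expands to nothing, so the same header still builds on the host.

```cpp
// G4_HD is a hypothetical macro name, not part of Geant4 or CUDA.
// nvcc defines __CUDACC__, so the qualifiers only appear in CUDA builds.
#ifdef __CUDACC__
#define G4_HD __host__ __device__
#else
#define G4_HD
#endif

// Simplified stand-in class: every member function carries the macro.
class Rectangle
{
public:
    G4_HD Rectangle(int w, int h) : width(w), height(h) {}
    G4_HD int getArea() const { return width * height; }

private:
    int width;
    int height;
};

#ifdef __CUDACC__
// Only compiled by nvcc: the same member functions are callable on the device.
__global__ void areaKernel(int* out)
{
    *out = Rectangle(20, 10).getArea();
}
#endif
```

The macro still has to be placed in front of each member function once, but that step could probably be scripted, and the code stays buildable without CUDA.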
We wrote a simple C++ program to test the Unified Memory approach:
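#include <iostream>
#include <cuda_runtime.h>
using namespace std;

// Inner class: holds pointers to its dimensions (no inheritance involved)
class Rectangle
{
public:
    Rectangle() {}
    // Only called on the host here; it would need __host__ __device__
    // to be callable from a kernel.
    int getArea()
    {
        return (*width * *height);
    }
    int* width;
    int* height;
};

// Outer class: holds a pointer to a Rectangle (composition, not a base class)
class Shape
{
public:
    Shape() {}
    Rectangle* rect;
};

// Kernel: follows two levels of pointers and writes through them on the device
__global__ void change_width(Shape* sha)
{
    *(sha->rect->width) = 210;
    *(sha->rect->height) = 10;
}

int main(void)
{
    Shape* sha;
    // Allocate every level in managed memory so that both host and device
    // can dereference the pointers. Constructors are not run on this raw memory.
    cudaMallocManaged(&sha, sizeof(Shape));
    cudaMallocManaged(&sha->rect, sizeof(Rectangle));
    cudaMallocManaged(&sha->rect->width, sizeof(int));
    cudaMallocManaged(&sha->rect->height, sizeof(int));
    *(sha->rect->width) = 20;
    *(sha->rect->height) = 10;
    change_width<<<1, 1>>>(sha);
    // Wait for the kernel before touching managed memory on the host
    cudaDeviceSynchronize();
    // Print the area of the object.
    cout << "Total area: " << sha->rect->getArea() << endl;
    // Release the managed allocations, innermost first
    cudaFree(sha->rect->width);
    cudaFree(sha->rect->height);
    cudaFree(sha->rect);
    cudaFree(sha);
    return 0;
}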
With this code, we can port a two-level class structure to the GPU, and it works.
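To avoid one cudaMallocManaged call per pointer, a pattern we have seen suggested for Unified Memory is a small base class that overloads operator new and operator delete, so that every object created with new lands in managed memory. The sketch below is our guess at how that might look for the test classes: the Managed name and the constructor arguments are our own assumptions, the members are plain ints to keep it short, and the malloc fallback exists only so the file also compiles without nvcc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#ifdef __CUDACC__
#include <cuda_runtime.h>
#endif

// Hypothetical base class: objects created with `new` go to managed memory.
struct Managed
{
    static void* operator new(std::size_t len)
    {
        void* ptr = nullptr;
#ifdef __CUDACC__
        if (cudaMallocManaged(&ptr, len) != cudaSuccess) throw std::bad_alloc();
#else
        ptr = std::malloc(len);  // host-only fallback when built without nvcc
        if (!ptr) throw std::bad_alloc();
#endif
        return ptr;
    }

    static void operator delete(void* ptr)
    {
#ifdef __CUDACC__
        cudaDeviceSynchronize();  // make sure no kernel is still using the object
        cudaFree(ptr);
#else
        std::free(ptr);
#endif
    }
};

class Rectangle : public Managed
{
public:
    Rectangle(int w, int h) : width(w), height(h) {}
    int width;
    int height;
};

class Shape : public Managed
{
public:
    // The nested object is also allocated through Managed::operator new.
    Shape(int w, int h) : rect(new Rectangle(w, h)) {}
    ~Shape() { delete rect; }
    Rectangle* rect;
};
```

With this, `Shape* sha = new Shape(20, 10);` would allocate both levels in one line and `delete sha;` would release them. It only covers objects created with new; stack objects and members held by value inside non-managed objects would still need separate handling.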
However, Geant4 is a rather big project, and I am afraid that this simple idea will not scale to it.
Could anyone with experience in porting C++ code to the GPU give me some advice?
I really have no idea how to handle such a big program.
Thanks for all your help.
Sincerely,
KEVIN