Hi. I need to implement a global function (in CUDA C) that calls a serial function many times. The serial function is recursive and fills an array of 5 elements; I want the global function to produce that array for each thread. The code should be something like this:
struct matrix {
    int rig;
    int col;
    float *N;
    int *M;
};
__device__ float serial_func(struct matrix *A, struct matrix *B) {
    if (-----) {
        for (int j = 0; j < 5; j++) {
            A->N[j] = B->N[j];
        }
        serial_func(A, B);
    } else {
        for (int j = 0; j < 5; j++) {
            A->N[j] = B->N[j];
        }
    }
    return 0;
}
__global__ void kernelFunc(?) {
    int i = threadIdx.x;
    if (i < 100)
        serial_func();
}
How can I implement it? Which parameters do I have to pass to the global function?
If you make the following compilable, so that it matches your intent, I expect someone can help you with whatever remains:
struct matrix {
    int rig;
    int col;
    float *N;
    int *M;
};
float serial_func(struct matrix *A, struct matrix *B) {
    if (-----) {
        for (int j = 0; j < 5; j++) {
            A->N[j] = B->N[j];
        }
        serial_func(A, B);
    } else {
        for (int j = 0; j < 5; j++) {
            A->N[j] = B->N[j];
        }
    }
    return 0;
}
void kernelFunc(int i, ?) {
    if (i < 100)
        serial_func();
}
int main() {
    int max = 100;
    for (int i = 0; i < max; i++)
        kernelFunc(i, ?);
    return 0;
}
And if you intend to have every thread work on the same A and B, then this isn't going to work. You can't have multiple threads updating the same locations in memory.
Actually, I did it in C, but I have some problems doing the same in CUDA C. The idea would be to create a global function where each thread calls the serial (device) function once. Also, it seems that device functions don't allow recursion.
Right, I imagined you had, and what I was suggesting was that you fill out the code I already provided to make it complete. If you want to do that, I’ll take another look. If not perhaps someone else will be able to help.
You can certainly have a recursive device function in CUDA, and there are multiple forum posts about it if you want to see an example. It does raise the issue of stack depth, which needs to be carefully considered.
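Putting the pieces together, one possible shape for the CUDA version is sketched below. This is an assumption-laden sketch, not a drop-in solution: the `depth` parameter stands in for the elided `-----` condition, and each thread is given its own `(A[i], B[i])` pair so no two threads write the same memory. Device-side recursion requires compute capability 2.0 or higher (e.g. `nvcc -arch=sm_52`), and deep recursion may need a larger per-thread stack via `cudaDeviceSetLimit`.

```
struct matrix {
    int rig;
    int col;
    float *N;
    int *M;
};

// `depth` is a hypothetical stand-in for the original "-----" condition;
// it guarantees the recursion reaches a base case.
__device__ float serial_func(struct matrix *A, struct matrix *B, int depth) {
    for (int j = 0; j < 5; j++) {
        A->N[j] = B->N[j];
    }
    if (depth > 0)
        return serial_func(A, B, depth - 1);
    return 0;
}

__global__ void kernelFunc(struct matrix *A, struct matrix *B, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        serial_func(&A[i], &B[i], A[i].rig);  // per-thread pair: no race
}

// Host side (error checking and device allocation of the structs and their
// N arrays omitted -- remember the N pointers must point to device memory):
//   cudaDeviceSetLimit(cudaLimitStackSize, 8192);  // room for the recursion
//   kernelFunc<<<1, 128>>>(dA, dB, 100);
//   cudaDeviceSynchronize();
```

So the parameters to pass to the global function are device pointers to the matrices (one pair per thread, or per-thread slices of larger arrays), plus the element count; the "return value" comes back by copying the `N` arrays to the host with `cudaMemcpy` after the kernel finishes.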