Hey,
I’ve searched around the forums for anything regarding this subject, but nothing turned up. Apologies if it has been posted before!
I have been writing code for physical simulations for a few years, and recently we decided to port our code over to CUDA. The situation is basically that we have a 2D sheet of multiple separate cells with different properties, each of which needs a slightly different equation applied to the same variables. In simple pseudocode, something like:
for (x = 0; x < 1000; x++) {
    for (y = 0; y < 1000; y++) {
        if (a[x][y] == 1) b[x][y] = equation1;
        if (a[x][y] == 2) b[x][y] = equation2;
    }
}
Now, as I understand it, this can be easily implemented in CUDA by writing:
__global__ void calc(double *a, double *b)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (a[idx] == 1) b[idx] = equation1;
    if (a[idx] == 2) b[idx] = equation2;
}
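For concreteness, a fuller sketch of this kind of kernel might look like the following. It is only an illustration under assumptions: the state array is stored as `int` rather than `double`, `equation1` and `equation2` are made-up placeholder device functions (the real equations are not shown here), `run_calc` is a hypothetical host helper, and error checking on the CUDA calls is left out for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical placeholder equations, standing in for the real physics.
__device__ double equation1(double v) { return 2.0 * v; }
__device__ double equation2(double v) { return v + 1.0; }

// One thread per cell: `state` selects which equation updates b[idx].
__global__ void calc(const int *state, double *b, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx >= n) return;  // guard the threads past the end of the array

    // Threads of one warp that pick different cases run those cases
    // one after another rather than in parallel.
    switch (state[idx]) {
    case 1: b[idx] = equation1(b[idx]); break;
    case 2: b[idx] = equation2(b[idx]); break;
    default: break;  // unknown state: leave the cell unchanged
    }
}

// Hypothetical helper: copy the inputs to the GPU, run the kernel,
// and copy the updated values back into h_b.
void run_calc(const int *h_state, double *h_b, int n)
{
    int *d_state = nullptr;
    double *d_b = nullptr;
    cudaMalloc(&d_state, n * sizeof(int));
    cudaMalloc(&d_b, n * sizeof(double));
    cudaMemcpy(d_state, h_state, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, h_b, n * sizeof(double), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;  // round up to cover all n cells
    calc<<<blocks, threads>>>(d_state, d_b, n);

    // This copy waits for the kernel to finish before reading the results.
    cudaMemcpy(h_b, d_b, n * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_state);
    cudaFree(d_b);
}

int main()
{
    const int n = 1000 * 1000;  // the 1000 x 1000 sheet, flattened to 1D
    std::vector<int> state(n);
    std::vector<double> b(n, 3.0);
    for (int i = 0; i < n; ++i) state[i] = 1 + (i % 2);  // alternate states 1 and 2

    run_calc(state.data(), b.data(), n);
    printf("b[0] = %f, b[1] = %f\n", b[0], b[1]);
    return 0;
}
```

Alternating the two states cell by cell, as `main` does here, is deliberately an unfavourable layout, since neighbouring threads in the same warp take different branches. If cells with the same state sit next to each other in runs of 32 or more, most warps would follow a single branch.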
However, I have heard that using conditionals within a kernel adversely affects performance. In the full code, we have around 10-11 different states rather than the two shown here. My question is, to what degree is performance affected by a statement like this? If the conditionals are only going to reduce my speed by a few percent, then I believe it will still be faster than rearranging the arrays in the main code before sending them off to the GPU. However, if this will slow my kernel down considerably…
Thanks, and once again, apologies if this is a silly/newbie question.
Jon