Necessary to index intermediate values in kernel?

Hello,

I am new to CUDA so apologies for the potentially simple question.

In a “normal” sequential code, assume I have some input array from which I compute an output array. To get the output array, I have to compute a lot of intermediate variables in a series of steps. If I need only 1 value of the input array to get 1 value of the output array, there is no real need to index the intermediate values or store them as arrays.

In a CUDA parallel code, if every single thread is handling a different index of the input array, is it also necessary to explicitly index the intermediate variables with the thread ID before writing the output?

For example, assume y = f(x).

A sequential code…

x is a 500 element array…

for i=1:500
    temp = x[i]
    a = 2*temp
    b = a*5
    c = b/2
    d = c*10
    y[i] = d
end

Not indexing the intermediate variables a, b, c, d is fine, and there is no ambiguity in a sequential code.

In a parallel code, I would get rid of the for loop and instead use something like

i = threadIdx.x

Then, I am not sure if this is OK as-is or if I need to specify explicitly the threadID in the intermediate a,b,c,d variables. I tried reading the Programming Guide but couldn’t really see a similar problem.

Thank you,
Kind Regards,

F

Everything you declare in a kernel without a special qualifier (such as __shared__) is local to the thread. So “int a” is a separate variable in each thread, and you do not need to index a, b, c, d with the thread ID.
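To make that concrete, here is a minimal sketch of the loop from the question written as a kernel. The names f_kernel and run_f, the float type, and the block size of 128 are my own assumptions for illustration, and error checking is left out to keep it short.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel name. Each thread computes one output element.
// temp, a, b, c, d are ordinary local variables, so every thread has
// its own private copy; none of them needs a thread index.
__global__ void f_kernel(const float *x, float *y, int n)
{
    // Global index of this thread across all blocks (0-based).
    int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Guard: the last block may contain more threads than remaining elements.
    if (i < n) {
        float temp = x[i];
        float a = 2.0f * temp;
        float b = a * 5.0f;
        float c = b / 2.0f;
        float d = c * 10.0f;
        y[i] = d;
    }
}

// Hypothetical host helper: copies hx to the device, runs the kernel,
// and copies the result back into hy. No error checking (assumption:
// a working CUDA device is present).
void run_f(const float *hx, float *hy, int n)
{
    float *dx = nullptr, *dy = nullptr;
    cudaMalloc((void **)&dx, n * sizeof(float));
    cudaMalloc((void **)&dy, n * sizeof(float));
    cudaMemcpy(dx, hx, n * sizeof(float), cudaMemcpyHostToDevice);

    const int threads = 128;                         // assumed block size
    const int blocks = (n + threads - 1) / threads;  // round up to cover all n
    f_kernel<<<blocks, threads>>>(dx, dy, n);

    // This copy waits for the kernel to finish before reading the results.
    cudaMemcpy(hy, dy, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(dx);
    cudaFree(dy);
}

int main()
{
    const int n = 500;
    float hx[n], hy[n];
    for (int i = 0; i < n; ++i) hx[i] = (float)i;  // x[i] = i as sample input

    run_f(hx, hy, n);
    printf("y[3] = %f\n", hy[3]);
    return 0;
}
```

Only x and y are indexed with i, because they live in global memory that all threads see. With the sample input x[i] = i, the steps give y[i] = 50*x[i], so on a machine with a working CUDA device this prints y[3] = 150.000000.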

I would suggest reading the book “CUDA by Example” by J. Sanders and E. Kandrot; see

https://developer.nvidia.com/content/cuda-example-introduction-general-purpose-gpu-programming-0

It introduces CUDA programming gently through worked examples.