I am new to CUDA so apologies for the potentially simple question.
In “normal” sequential code, suppose I have an input array that produces an output array. To get the output array, I have to compute a lot of intermediate variables in a series of steps. If each output value depends on only one input value, there is no real need to index the intermediate values or store them as arrays.
In CUDA parallel code, if every thread handles a different index of the input array, is it necessary to also index the intermediate variables explicitly with the thread ID before writing the output?
For example, assume y = f(x).
Sequential code:
x is a 500-element array, and the body below runs in a for loop over i:
temp = x[i]
a = 2*temp
b = a*5
c = b/2
d = c*10
y[i] = d
Not indexing the intermediate variables a, b, c, d is fine, and there is no ambiguity in sequential code.
In parallel code, I would get rid of the for loop, i.e. each thread would handle one value of i.
Then I am not sure whether this is OK as-is or whether I need to explicitly index the intermediate variables a, b, c, d with the thread ID. I tried reading the Programming Guide but couldn’t find a similar problem.
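
To make the question concrete, here is a minimal sketch of the parallel version, assuming float data and one thread per element. The names `f_kernel` and `compute_on_gpu` are hypothetical, the block size of 256 is an arbitrary choice, and error checking is left out for brevity. Variables declared inside a `__global__` function are private to each thread, so a, b, c, d are written here without any thread-ID index; only the reads from x and the writes to y use i.

```cuda
#include <cuda_runtime.h>
#include <vector>

// One thread per element. temp, a, b, c, d are ordinary local variables,
// so every thread has its own private copy of each of them.
__global__ void f_kernel(const float* x, float* y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i >= n) return;  // guard: the grid may contain more threads than elements

    float temp = x[i];
    float a = 2.0f * temp;
    float b = a * 5.0f;
    float c = b / 2.0f;
    float d = c * 10.0f;
    y[i] = d;
}

// Hypothetical host helper (not from the original post): copies x to the GPU,
// launches the kernel, and copies y back. CUDA error checks are omitted.
std::vector<float> compute_on_gpu(const std::vector<float>& x)
{
    int n = static_cast<int>(x.size());
    std::vector<float> y(n);
    if (n == 0) return y;

    float* d_x = nullptr;
    float* d_y = nullptr;
    size_t bytes = n * sizeof(float);
    cudaMalloc(&d_x, bytes);
    cudaMalloc(&d_y, bytes);
    cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;                         // assumed block size
    int blocks = (n + threads - 1) / threads;  // round up to cover all n elements
    f_kernel<<<blocks, threads>>>(d_x, d_y, n);

    // This blocking copy waits for the kernel to finish before reading d_y.
    cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_x);
    cudaFree(d_y);
    return y;
}
```

Calling `compute_on_gpu` on the 500-element x from the example would launch 2 blocks of 256 threads; the 12 surplus threads exit at the bounds check.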