Dependency problem in parallel threads (related to numerical methods)

I aim to solve ODEs by parallel processing because the iterative methods consume a lot of CPU time.

What I want to know is: can one thread access the result of another?
i.e. can I do a[idx+1] = a[idx] + something;

Take the Fibonacci series, for example: each element depends on the previous two.
If all the threads execute in parallel, this won’t be possible, because to assign a value in a particular thread I need the value from the previous one, and until that thread has finished, the next thread should not execute.

Is there a way to do this?

The short answer is no, you can’t do it. There isn’t any way of guaranteeing the execution order so that this sort of data dependency could be satisfied without effectively serializing the kernel execution.
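To make the dependency concrete, here is a small plain-Python sketch (not CUDA). The function names and the constant c are made up for illustration. The second function imitates only one possible outcome, where every "thread" reads the array before any write has landed; on real hardware the order is simply undefined.

```python
def serial_update(a, c):
    """a[i+1] = a[i] + c applied in order: each step sees the previous result."""
    out = list(a)
    for i in range(len(out) - 1):
        out[i + 1] = out[i] + c
    return out

def simultaneous_update(a, c):
    """Same assignment, but every 'thread' reads the old array first.

    This is an assumed ordering chosen for illustration, not what a GPU
    is guaranteed to do.
    """
    old = list(a)
    out = list(a)
    for i in range(len(old) - 1):
        out[i + 1] = old[i] + c
    return out

a = [1, 0, 0, 0]
print(serial_update(a, 1))
print(simultaneous_update(a, 1))
```

The two results differ, because in the second case no element ever sees the updated value of its neighbour.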

But I must say I am curious about the context of your question. People (myself included) are solving ODEs and PDEs using CUDA by a large number of different numerical methods. It isn’t immediately obvious to me where that sort of requirement you are asking about would crop up in real world solver algorithms.

Well, I have just begun solving ODEs and was writing code for initial value problems based on the Euler method or fourth-order Runge-Kutta.
The for loop looks like this:

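for i = 1:n
    x(i+1) = x(i) + h;                 % advance the independent variable by one step h
    y(i+1) = y(i) + h*f(x(i), y(i));   % Euler update: needs y(i) from the previous pass
end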

This is a serial operation.
I am new to this. Since you solve ODEs by parallel processing, could you please help?

I think that you want to solve

dx/dt = f(x(t), t)

Forward Euler formula: x(t+h) = x(t) + h * f(x(t), t)

So the index i in your code is a time label, say x(i) = x(i * h).

However, under such circumstances you cannot obtain the value at a future time before the values at earlier times have been solved.

For example, you cannot obtain x(5) until x(0), x(1), x(2), x(3), x(4) have been computed.
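A minimal Python sketch of the forward formula, just to show where the dependency sits. The function name euler and the test problem dx/dt = -x are assumptions chosen for illustration; they are not from the original posts.

```python
import math

def euler(f, x0, t0, h, n):
    """Forward Euler: return [x(t0), x(t0 + h), ..., x(t0 + n*h)]."""
    xs = [x0]
    t = t0
    for i in range(n):
        # Step i+1 reads the result of step i, so these iterations
        # cannot be handed to independent threads.
        xs.append(xs[i] + h * f(xs[i], t))
        t += h
    return xs

# Hypothetical test problem: dx/dt = -x, x(0) = 1, exact solution exp(-t).
xs = euler(lambda x, t: -x, 1.0, 0.0, 0.01, 100)
print(xs[-1], math.exp(-1.0))
```

The loop body is cheap for a scalar ODE; the time loop only becomes worth accelerating when each step itself involves a lot of independent work.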

I think the parallelism should be focused on the spatial domain, not the time domain.

For example, consider the model problem

dx/dt = A x, where A comes from the Laplacian (1-D heat equation).

Then A is sparse (tridiagonal), and you can use the GPU to accelerate the computation of A * x.
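A sketch of that idea in plain Python, assuming the standard second-difference stencil with zero boundary values. The names laplacian_matvec, euler_step, spacing, and the grid values are hypothetical, chosen only for illustration. Each entry of A * x reads only the old vector, so on a GPU each index j could be given to its own thread, while the loop over time stays serial.

```python
def laplacian_matvec(x, spacing):
    """Return A * x for the tridiagonal 1-D Laplacian (zero boundary values)."""
    n = len(x)
    y = [0.0] * n
    for j in range(n):
        # Row j touches only x[j-1], x[j], x[j+1] of the input vector,
        # so all n rows are independent of each other.
        left = x[j - 1] if j > 0 else 0.0
        right = x[j + 1] if j < n - 1 else 0.0
        y[j] = (left - 2.0 * x[j] + right) / (spacing * spacing)
    return y

def euler_step(x, h, spacing):
    """One forward Euler step of dx/dt = A x."""
    ax = laplacian_matvec(x, spacing)
    return [xj + h * aj for xj, aj in zip(x, ax)]

# Hypothetical setup: 5 interior points, a unit spike in the middle.
x = [0.0, 0.0, 1.0, 0.0, 0.0]
spacing, h = 1.0, 0.25   # h <= spacing**2 / 2 keeps forward Euler stable here
for step in range(3):    # the loop over time remains sequential
    x = euler_step(x, h, spacing)
print(x)
```

The inner loop over j is the part that maps onto threads; the outer loop over step is the same serial dependency as before.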