I’m trying to compute a recurrence in parallel. I’ve been playing around with Thrust, and it looks like it could almost do what I want if I could combine a few of the examples, but I haven’t managed to do that. I’m not sure whether Thrust can really do this, or whether I should try another library.

I want to calculate something like

x(t+1) = a(t+1) + b(t+1)*x(t)

for t = 0 to n, and for many independent “people.” The people could really be indexed over several dimensions, but the important point is that each recurrence lives in its own “world,” so different people can be computed in parallel. a() and b() could also be computed in parallel, or treated as given. x(0) for each person can also be treated as given.

The problem I’m running into with Thrust is that it can iterate over tuples with thrust::for_each, and it can compute cumulative sums with thrust::transform_inclusive_scan, but I don’t know how to do both at the same time.
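
For what it’s worth, the recurrence itself can be written as a scan. Each step is the affine map x -> a + b*x, and composing two affine maps gives another affine map, so the composition is associative. Below is a minimal host-side sketch of that idea for a single person, using std::partial_sum as a sequential stand-in for a scan. The names Affine, ComposeAffine and solve_by_scan are made up for illustration. I’d assume the same operator could be handed to thrust::inclusive_scan (or thrust::inclusive_scan_by_key with a person index as the key) once operator() is marked __host__ __device__, but I haven’t verified that here.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// One time step, viewed as the affine map x -> a + b*x.
struct Affine {
    double a;
    double b;
};

// Compose two steps: apply `first`, then `second`.
// second(first(x)) = second.a + second.b * (first.a + first.b * x)
struct ComposeAffine {
    Affine operator()(const Affine& first, const Affine& second) const {
        return Affine{second.a + second.b * first.a, second.b * first.b};
    }
};

// Hypothetical helper: returns x(1..n) for one person,
// given x(0) and the coefficients a(1..n), b(1..n).
std::vector<double> solve_by_scan(double x0,
                                  const std::vector<double>& a,
                                  const std::vector<double>& b) {
    assert(a.size() == b.size());

    std::vector<Affine> steps(a.size());
    for (std::size_t t = 0; t < a.size(); ++t) {
        steps[t] = Affine{a[t], b[t]};
    }

    // Inclusive scan under composition: afterwards steps[t] is the
    // single affine map that takes x(0) directly to x(t+1).
    std::partial_sum(steps.begin(), steps.end(), steps.begin(), ComposeAffine());

    // Apply every composed map to x(0).
    std::vector<double> x(a.size());
    for (std::size_t t = 0; t < a.size(); ++t) {
        x[t] = steps[t].a + steps[t].b * x0;
    }
    return x;
}
```

One caveat: floating-point arithmetic is not exactly associative, so a parallel scan may differ from the plain sequential loop by rounding error.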

I’d like a solution where each thread is given a number of vector arguments and then performs the recursive calculation. Can I do that with Thrust, or would some other library be better? I considered doing each time step in Thrust and then looping over time, but that imposes an unnecessary restriction: every person must finish a time step before any of them can start the next one.
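
To make the “one thread per person” idea concrete, here is a rough sketch of the kind of functor I have in mind, written as plain host C++ so that it runs anywhere. The names RecurrencePerPerson and solve_all and the time-major layout (index t*num_people + p) are my own assumptions. In Thrust, I’d expect operator() to need __host__ __device__, the pointers to come from thrust::raw_pointer_cast on device vectors, and the loop over people to become a thrust::for_each over a thrust::counting_iterator range.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: the value for person p at step t is stored at
// index t * num_people + p (time-major).
struct RecurrencePerPerson {
    const double* a;   // a(t+1) for each person, row t
    const double* b;   // b(t+1) for each person, row t
    const double* x0;  // x(0) for each person
    double* x;         // output: x(t+1) for each person, row t
    std::size_t num_people;
    std::size_t num_steps;

    // Runs the whole recurrence for person p, sequentially in time.
    void operator()(std::size_t p) const {
        double prev = x0[p];
        for (std::size_t t = 0; t < num_steps; ++t) {
            const std::size_t i = t * num_people + p;
            prev = a[i] + b[i] * prev;
            x[i] = prev;
        }
    }
};

// Host-side stand-in for "launch one thread per person".
std::vector<double> solve_all(const std::vector<double>& a,
                              const std::vector<double>& b,
                              const std::vector<double>& x0,
                              std::size_t num_steps) {
    const std::size_t num_people = x0.size();
    assert(a.size() == num_people * num_steps);
    assert(b.size() == num_people * num_steps);

    std::vector<double> x(num_people * num_steps);
    RecurrencePerPerson f{a.data(), b.data(), x0.data(), x.data(),
                          num_people, num_steps};
    for (std::size_t p = 0; p < num_people; ++p) {
        f(p);  // each call is independent of every other person
    }
    return x;
}
```

With this layout, neighbouring people sit next to each other in memory at every time step, which should let neighbouring GPU threads read adjacent addresses. No person ever waits on another, so there is no barrier between time steps.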