Accelerating loops where the current state is dependent on the previous state

I am working on accelerating on a compression algorithm where the current state of the variables in the loop is dependent on the previous state. Any ideas as to how to go about doing this?

Use the fastest serial processor you can get hold of, or reformulate your problem.

A simple prefix sum, depending on how you realize it in a serial approach, could fit your description: "the current state of the variables in the loop is dependent on the previous state. "

Y[i] = X[i] + Y[i-1] (i = 0…n-1)
Y[-1] = 0

However there exist parallel implementations that are interesting:

https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch39.html

(note that if your problem is actually a previx sum or similar, I don’t suggest you roll your own. These types of problems are already solved; I encourage you to use a high-quality library implementation.)

Other types of problems may fit in your general description. Recurrence relations are often expressed recursively, but in some cases there exist closed-form non-recursive solutions which may lend themselves to parallelization, or other “approximations” that can be arrived at by transforming the realization.

Here are 2 related examples:

  • naive closed form analysis:

https://stackoverflow.com/questions/34868437/sequential-operation-in-gpu-implementation/34868913#34868913

(the links from that answer are more useful than the answer itself)

  • approximation:

https://stackoverflow.com/a/56917718/1695960

In any event, the statement “the current state of the variables in the loop is dependent on the previous state” isn’t really enough to give general guidance that is highly productive. As is so often the case, when the question is expressed that way, the only non-tutorial type answer is “it depends”.

Thank you for your comments. I will look into your suggestions…