Is CUDA suited to gradual calculation?

By “gradual calculating” I mean that to calculate data[N], I must calculate it from data[0…N-1]. For example, to calculate d1 I must first calculate d0, and to calculate d2 I must first calculate d1 and d0. Is CUDA suited to this kind of “gradual calculating”? Is there any example or description in the CUDA SDK that covers this?

It sounds like a basic prefix sum would be an example, as would recurrence relations.

Example (prefix sum):

output[i] = input[i] + output[i-1]

At first glance, these seem to be “inherently sequential” algorithms. However, these problems can often be solved in parallel.
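To make that a bit more concrete, here is a minimal C++ sketch of a step-doubling inclusive scan next to the plain sequential loop. The function names (`sequential_scan`, `stepwise_scan`, `check_scans_agree`) are made up for illustration, and this runs on the CPU. It is only meant to show that within each pass every element can be computed independently, which is the part a GPU would hand to one thread per element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential reference: output[i] = input[i] + output[i-1].
std::vector<int> sequential_scan(const std::vector<int>& input) {
    std::vector<int> output(input.size());
    int running = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        running += input[i];
        output[i] = running;
    }
    return output;
}

// Step-doubling inclusive scan (hypothetical helper name).
// Takes about log2(n) passes; within a pass every i is independent.
std::vector<int> stepwise_scan(const std::vector<int>& input) {
    std::vector<int> current = input;
    std::vector<int> next(input.size());
    for (std::size_t offset = 1; offset < input.size(); offset *= 2) {
        // On a GPU this inner loop would be one thread per element:
        // it reads only `current` and writes only next[i].
        for (std::size_t i = 0; i < input.size(); ++i) {
            next[i] = (i >= offset) ? current[i] + current[i - offset]
                                    : current[i];
        }
        current.swap(next);
    }
    return current;
}

// Small self-check: both versions must agree on the same input.
void check_scans_agree() {
    const std::vector<int> input = {3, 1, 4, 1, 5, 9, 2};
    assert(sequential_scan(input) == stepwise_scan(input));
}
```

Calling `check_scans_agree()` from your own `main` exercises both versions. Note that this simple form does more total additions than the sequential loop (roughly n log n instead of n), so tuned GPU implementations generally use more work-efficient variants.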

There is a seminal paper by Kogge and Stone from the 1970s that covers recurrence relations.
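As a rough illustration of how a recurrence can be handled the same way, take the first-order linear form x[i] = a[i] * x[i-1] + b[i]. This is my own sketch of the general idea, not code from the paper: each step is treated as a map x -> a*x + b, and since composing such maps is associative, the steps can be combined pairwise in about log2(n) passes. The names `Affine`, `compose`, `solve_sequential`, `solve_stepwise` and `check_solvers_agree` are assumptions made up for this example, and it runs on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One step of x[i] = a * x[i-1] + b, viewed as the map x -> a*x + b.
struct Affine {
    long long a;
    long long b;
};

// Compose two steps: apply `first`, then `second`. This is associative.
Affine compose(const Affine& first, const Affine& second) {
    return {second.a * first.a, second.a * first.b + second.b};
}

// Sequential reference: each x[i] needs x[i-1].
std::vector<long long> solve_sequential(const std::vector<Affine>& steps,
                                        long long x0) {
    std::vector<long long> x(steps.size());
    long long prev = x0;
    for (std::size_t i = 0; i < steps.size(); ++i) {
        prev = steps[i].a * prev + steps[i].b;
        x[i] = prev;
    }
    return x;
}

// Step-doubling version: scan over the maps instead of over the values.
std::vector<long long> solve_stepwise(const std::vector<Affine>& steps,
                                      long long x0) {
    std::vector<Affine> current = steps;
    std::vector<Affine> next(steps.size());
    for (std::size_t offset = 1; offset < steps.size(); offset *= 2) {
        // Each i is independent within a pass (one GPU thread per i).
        for (std::size_t i = 0; i < steps.size(); ++i) {
            next[i] = (i >= offset) ? compose(current[i - offset], current[i])
                                    : current[i];
        }
        current.swap(next);
    }
    // current[i] now maps x0 directly to x[i].
    std::vector<long long> x(steps.size());
    for (std::size_t i = 0; i < steps.size(); ++i) {
        x[i] = current[i].a * x0 + current[i].b;
    }
    return x;
}

// Small self-check: both solvers must agree on the same input.
void check_solvers_agree() {
    const std::vector<Affine> steps = {{2, 1}, {3, 0}, {1, 5}, {-1, 2}, {2, 2}};
    assert(solve_sequential(steps, 1) == solve_stepwise(steps, 1));
}
```

The integer coefficients here just keep the check exact. With floating-point data the two versions can differ slightly, because the operations are grouped in a different order.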

For a treatment of a parallel prefix sum, try this:

Many of the CUDA sample codes show how to implement these, and libraries such as Thrust and CUB also provide ready-made functions for them.

Thanks txbob, could you tell me where I can download the samples?

If you install the CUDA toolkit using the runfile install method, the samples will be installed as well.