In OpenMP we have reduction clauses, like reduction(+:array). Do we have something like that in CUDA C as well?
Thrust may do what you want: it provides thrust::reduce for parallel reductions on the GPU.
Google Code Archive - Long-term storage for Google Code Project Hosting.
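For reference, here is a minimal sketch of what a sum reduction with Thrust might look like, roughly the counterpart of an OpenMP reduction(+:...) clause over a loop. The helper name sum_on_device and the test data are made up for illustration; thrust::device_vector, thrust::reduce and thrust::plus are the Thrust pieces it relies on. It assumes a CUDA toolkit with Thrust available and compilation with nvcc.

```cuda
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>

#include <cstdio>
#include <vector>

// Hypothetical helper (not part of Thrust): sums a host array on the GPU.
int sum_on_device(const std::vector<int>& host)
{
    // Copy the input from host memory to device memory.
    thrust::device_vector<int> d(host.begin(), host.end());

    // Parallel reduction: 0 is the initial value, thrust::plus<int> the operator.
    return thrust::reduce(d.begin(), d.end(), 0, thrust::plus<int>());
}

int main()
{
    std::vector<int> a(1000, 1);            // 1000 elements, each equal to 1
    std::printf("sum = %d\n", sum_on_device(a));
    return 0;
}
```

Other operators can be swapped in the same way, for example thrust::maximum<int>() with a suitable initial value. Note that this reduces the whole array to a single scalar, unlike an OpenMP array reduction, which keeps one result per element.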