Has anyone been able to implement reduction operations efficiently in CUDA?
See the prefix-sum (scan) examples in the SDK. It is very easy to replace the add with a min, max, etc.
Peter
Also the scalarProduct example.
Mark
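To make the "replace the add with a min/max" suggestion concrete, here is a minimal sketch of a shared-memory tree reduction with the operator passed in as a template parameter. This is not the SDK code; the names `MinOp`, `SumOp`, `reduceKernel`, `reduceOnGpu`, and `kBlockSize` are made up for illustration, and error checking on the CUDA calls is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <cfloat>
#include <vector>
#include <algorithm>

// Hypothetical operator types: each supplies an identity element and an
// associative combine step. Swapping the type swaps the reduction.
struct SumOp {
    __host__ __device__ static float identity() { return 0.0f; }
    __host__ __device__ static float apply(float a, float b) { return a + b; }
};
struct MinOp {
    __host__ __device__ static float identity() { return FLT_MAX; }
    __host__ __device__ static float apply(float a, float b) { return a < b ? a : b; }
};

constexpr int kBlockSize = 256;  // assumed block size; must be a power of two

// Each block reduces its share of the input to one partial result.
template <typename Op>
__global__ void reduceKernel(const float* in, float* out, int n) {
    __shared__ float sdata[kBlockSize];
    int tid = threadIdx.x;

    // Grid-stride loop: every thread folds several elements into a register.
    float acc = Op::identity();
    for (int i = blockIdx.x * blockDim.x + tid; i < n; i += blockDim.x * gridDim.x)
        acc = Op::apply(acc, in[i]);
    sdata[tid] = acc;
    __syncthreads();

    // Tree reduction in shared memory: halve the active threads each step.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            sdata[tid] = Op::apply(sdata[tid], sdata[tid + s]);
        __syncthreads();
    }
    if (tid == 0) out[blockIdx.x] = sdata[0];
}

// Host wrapper: launches the kernel, then combines the per-block partial
// results on the CPU. Error checking is omitted in this sketch.
template <typename Op>
float reduceOnGpu(const std::vector<float>& h) {
    int n = static_cast<int>(h.size());
    if (n == 0) return Op::identity();
    int blocks = std::min((n + kBlockSize - 1) / kBlockSize, 64);

    float *dIn = nullptr, *dOut = nullptr;
    cudaMalloc(&dIn, n * sizeof(float));
    cudaMalloc(&dOut, blocks * sizeof(float));
    cudaMemcpy(dIn, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    reduceKernel<Op><<<blocks, kBlockSize>>>(dIn, dOut, n);

    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), dOut, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(dIn);
    cudaFree(dOut);

    float result = Op::identity();
    for (float p : partial) result = Op::apply(result, p);
    return result;
}
```

Calling `reduceOnGpu<MinOp>(values)` gives the minimum and `reduceOnGpu<SumOp>(values)` the sum; a max or a dot product follows the same pattern with a different operator or a multiply in the load loop. The final combine is done on the host only to keep the sketch short; a second kernel launch over the partial results would avoid the extra copy for large grids.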