Hi all! Could you please help me with the following question: how can I get the max value of a vector (one-dimensional array) using parallel reduction?
I found a kernel function that computes the sum of a vector, but I can't work out how to adapt it to get the max value.
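To make the question concrete, here is a minimal sketch of how a typical shared-memory sum reduction might be turned into a max reduction: the `+=` becomes `fmaxf`, and the starting value becomes `-FLT_MAX` instead of 0. The names `maxReduceKernel` and `deviceMax`, the `float` element type, and the launch sizes (64 blocks of 256 threads) are assumptions for illustration, not taken from any particular SDK sample. It also assumes a power-of-two block size and n > 0, and it skips error checking.

```cuda
#include <cfloat>
#include <cuda_runtime.h>

// Hypothetical kernel: each block writes the maximum of its share of the input.
// Assumes blockDim.x is a power of two and dynamic shared memory holds
// blockDim.x floats.
__global__ void maxReduceKernel(const float* in, float* blockMax, unsigned int n)
{
    extern __shared__ float sdata[];
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned int stride = blockDim.x * gridDim.x;

    // Grid-stride loop: each thread keeps a running maximum of its elements.
    float m = -FLT_MAX;
    for (; i < n; i += stride)
        m = fmaxf(m, in[i]);
    sdata[tid] = m;
    __syncthreads();

    // Tree reduction in shared memory, same shape as the sum version.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            sdata[tid] = fmaxf(sdata[tid], sdata[tid + s]);
        __syncthreads();
    }

    if (tid == 0)
        blockMax[blockIdx.x] = sdata[0];
}

// Hypothetical host wrapper: d_in already lives in GPU global memory.
float deviceMax(const float* d_in, unsigned int n)
{
    const unsigned int threads = 256;  // power of two (assumed)
    const unsigned int blocks = 64;    // arbitrary choice for the sketch
    float* d_partial = nullptr;
    float* d_result = nullptr;
    cudaMalloc(&d_partial, blocks * sizeof(float));
    cudaMalloc(&d_result, sizeof(float));

    // Pass 1: one partial maximum per block.
    maxReduceKernel<<<blocks, threads, threads * sizeof(float)>>>(d_in, d_partial, n);
    // Pass 2: a single block reduces the partial maxima to one value.
    maxReduceKernel<<<1, threads, threads * sizeof(float)>>>(d_partial, d_result, blocks);

    float result = 0.0f;
    cudaMemcpy(&result, d_result, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_partial);
    cudaFree(d_result);
    return result;
}
```

The input never leaves the GPU; only the single result value is copied back to the host.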
I know about the Thrust library and the function thrust::max_element, but as far as I understand, the Thrust function works on arrays allocated on the host.
In my case the vector is already in global memory on the GPU. Any suggestions?
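One thing that might be worth checking: Thrust provides `thrust::device_ptr` and `thrust::device_pointer_cast` for wrapping a raw pointer to GPU global memory, so `thrust::max_element` may be usable here without copying the data to the host. A minimal sketch, assuming `float` elements and n > 0 (the wrapper name `thrustDeviceMax` is hypothetical):

```cuda
#include <thrust/device_ptr.h>
#include <thrust/extrema.h>

// Hypothetical wrapper: d_in is a raw pointer to n floats in GPU global memory.
float thrustDeviceMax(float* d_in, int n)
{
    // Wrap the raw pointer so Thrust treats the range as device memory.
    thrust::device_ptr<float> first = thrust::device_pointer_cast(d_in);

    // Returns an iterator to the largest element; the search runs on the GPU.
    thrust::device_ptr<float> it = thrust::max_element(first, first + n);

    // Dereferencing copies that single element back to the host.
    return *it;
}
```

The position of the maximum is available as `it - first` if the index is needed as well.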
Sorry for my bad English! :)