Hello everyone,
I have an array of 5 million+ elements and I need to find the largest value.
At the moment I run a loop over every 256 elements and return the max of each group of 256… then I repeat that on the results until I get the final one…
Does anyone have something better… maybe not involving loops? =P
Any tips or pointers are welcome.
Thanks
Just modify the CUDA reduction sample to compute the max instead of the sum.
UPDATE: you can also use the open-source Thrust template library. It should have operators to do this, so you don’t even have to program a CUDA kernel yourself.
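For reference, here is a rough sketch of what "reduction with max instead of sum" could look like. This is not the SDK sample itself, just a minimal two-pass version written under a few assumptions: the data is `float`, the block size is a power of two, and the names `blockMax`, `maxOnDevice`, `THREADS` and `BLOCKS` are made up for illustration. Error checking is left out to keep it short.

```cuda
#include <cfloat>
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>

constexpr int THREADS = 256;  // threads per block (assumed to be a power of two)
constexpr int BLOCKS  = 256;  // number of partial results from the first pass

// Hypothetical kernel: each block writes the max of the elements it visited
// to out[blockIdx.x].
__global__ void blockMax(const float* in, float* out, size_t n)
{
    __shared__ float sdata[THREADS];
    const unsigned tid = threadIdx.x;

    // Grid-stride loop: each thread first folds many elements into a register.
    float m = -FLT_MAX;
    const size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = blockIdx.x * (size_t)blockDim.x + tid; i < n; i += stride)
        m = fmaxf(m, in[i]);
    sdata[tid] = m;
    __syncthreads();

    // Tree reduction in shared memory, same shape as a sum reduction
    // but combining with fmaxf instead of +.
    for (unsigned s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            sdata[tid] = fmaxf(sdata[tid], sdata[tid + s]);
        __syncthreads();
    }
    if (tid == 0)
        out[blockIdx.x] = sdata[0];
}

// Hypothetical host helper: copies the data to the GPU and returns its maximum.
// For an empty input it returns -FLT_MAX.
float maxOnDevice(const std::vector<float>& h)
{
    const size_t n = h.size();
    float* d_in = nullptr;
    float* d_partial = nullptr;  // BLOCKS partial maxima + 1 slot for the result
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_partial, (BLOCKS + 1) * sizeof(float));
    cudaMemcpy(d_in, h.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Pass 1: BLOCKS partial maxima. Pass 2: one block reduces the partials.
    blockMax<<<BLOCKS, THREADS>>>(d_in, d_partial, n);
    blockMax<<<1, THREADS>>>(d_partial, d_partial + BLOCKS, BLOCKS);

    float result = -FLT_MAX;
    cudaMemcpy(&result, d_partial + BLOCKS, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_partial);
    return result;
}
```

Calling `maxOnDevice(v)` on a host `std::vector<float>` should return the same value as `*std::max_element(v.begin(), v.end())` for non-empty input without NaNs. The grid-stride loop means only two kernel launches are needed regardless of array size, instead of repeating the 256-element pass until one value is left.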
Hmm, I was thinking about doing it with one of the reduction kernels…
I'll check out the Thrust libs… I guess that would be my chance to upgrade to CUDA 2.2 as well.
thrust::max_element() should do what you want.
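A minimal sketch of how that call might be used, assuming `float` data that starts out in a host `std::vector` and a toolkit where the Thrust headers are available. The wrapper name `thrustMax` and the `MaxResult` struct are made up for this example, and the input must be non-empty because the returned iterator is dereferenced.

```cuda
#include <cstddef>
#include <vector>
#include <thrust/device_vector.h>
#include <thrust/extrema.h>

// Hypothetical result type: the largest value and where it was found.
struct MaxResult {
    float  value;
    size_t index;
};

// Hypothetical wrapper: copies the data to the GPU and finds the maximum there.
// Precondition: h is not empty.
MaxResult thrustMax(const std::vector<float>& h)
{
    // Constructing a device_vector from a host range copies the data to the GPU.
    thrust::device_vector<float> d(h.begin(), h.end());

    // max_element returns an iterator to the first largest element.
    thrust::device_vector<float>::iterator it =
        thrust::max_element(d.begin(), d.end());

    MaxResult r;
    r.value = *it;                              // copies one element back to the host
    r.index = static_cast<size_t>(it - d.begin());
    return r;
}
```

If only the value is needed and not its position, `thrust::reduce` with `thrust::maximum<float>()` as the binary operator is another option along the same lines.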