Global maximum of an image


I’m having trouble computing the global maximum of an image for a normalization function. I haven’t found any solution that runs on the GPU using multiple threads.

Does someone have an idea how to do this efficiently?


Reduction is the key here: have each block compute the maximum of a region of the image and store it. The next kernel invocation then takes the maximum of those partial results. Repeat until you end up with a single value.
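
A minimal sketch of one such reduction pass in CUDA, assuming the image has been flattened to a 1-D `float` array (the kernel name and launch parameters are illustrative, not from the original post):

```cuda
#include <cuda_runtime.h>
#include <cfloat>

// One reduction pass: each block finds the max of its chunk of `in`
// and writes it to out[blockIdx.x]. Calling this repeatedly, swapping
// in/out, shrinks the array until a single value remains.
__global__ void maxReduce(const float *in, float *out, int n)
{
    extern __shared__ float sdata[];
    int tid = threadIdx.x;
    int i = blockIdx.x * blockDim.x * 2 + tid;

    // Each thread loads up to two elements (the grid may not cover n exactly).
    float m = -FLT_MAX;
    if (i < n)              m = in[i];
    if (i + blockDim.x < n) m = fmaxf(m, in[i + blockDim.x]);
    sdata[tid] = m;
    __syncthreads();

    // Tree reduction in shared memory.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            sdata[tid] = fmaxf(sdata[tid], sdata[tid + s]);
        __syncthreads();
    }

    if (tid == 0)
        out[blockIdx.x] = sdata[0];
}
```

On the host you launch this in a loop, feeding each pass's output back in as the next pass's input, until only one element is left. If you don't need to write the kernel yourself, `thrust::reduce` with `thrust::maximum<float>()` does the same multi-pass reduction for you.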