Best way to find many minimums

Robert_Crovella · January 1, 2018, 4:52pm

It should be possible to read and minimize without doing any sort of parallel reduction sweep, until all the reading of input data is done by the block. Take a look at the code I indicated earlier. It appears that you are doing a full shared memory sweep each time you load a set of 64/128 inputs. This should not be necessary.