Hi,
In my program, each thread generates many values, and I want to find the minimum of all of them. At first I wanted to save every value to global memory and then compare them, but there are too many values to fit in global memory. I would like to use the atomic function atomicMin(), but it is not supported on compute capability 1.0 devices and it does not work on float values. Can you give me some advice?
And if you don’t have atomic support (a compute capability 1.0 device), you could write the per-block minimum values out to global memory, then do a short CPU loop over them to find the overall minimum. Or you could launch a second kernel to do that final reduction on the device in a second pass.
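Here is a rough sketch of the first option, in case it helps. It assumes each thread can keep a running minimum in a register instead of storing every value, so only one float per block ever reaches global memory. The names generate_value(), block_min(), find_min() and THREADS are made up for illustration, and generate_value() is only a placeholder for whatever your threads really compute. Error checking is left out to keep it short.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

#define THREADS 256  // threads per block; assumed to be a power of two

// Hypothetical placeholder for the real per-thread computation.
__device__ float generate_value(int tid, int i)
{
    return (float)((tid * 31 + i * 17) % 1000) - 500.0f;
}

// Each block writes a single minimum to block_mins[blockIdx.x].
// Must be launched with exactly THREADS threads per block.
__global__ void block_min(float *block_mins, int values_per_thread)
{
    __shared__ float smin[THREADS];
    int tid = blockIdx.x * blockDim.x + threadIdx.x;

    // Running minimum kept in a register; no value is stored individually.
    float m = FLT_MAX;
    for (int i = 0; i < values_per_thread; ++i)
        m = fminf(m, generate_value(tid, i));

    smin[threadIdx.x] = m;
    __syncthreads();

    // Tree reduction in shared memory; no atomics are needed.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s)
            smin[threadIdx.x] = fminf(smin[threadIdx.x], smin[threadIdx.x + s]);
        __syncthreads();
    }

    if (threadIdx.x == 0)
        block_mins[blockIdx.x] = smin[0];
}

// Host side: launch the kernel, then loop over the per-block results on the CPU.
float find_min(int blocks, int values_per_thread)
{
    float *d_mins = NULL;
    cudaMalloc((void **)&d_mins, blocks * sizeof(float));

    block_min<<<blocks, THREADS>>>(d_mins, values_per_thread);

    std::vector<float> h_mins(blocks);
    cudaMemcpy(h_mins.data(), d_mins, blocks * sizeof(float),
               cudaMemcpyDeviceToHost);
    cudaFree(d_mins);

    float result = h_mins[0];
    for (int b = 1; b < blocks; ++b)
        if (h_mins[b] < result)
            result = h_mins[b];
    return result;
}
```

With a few hundred blocks the CPU loop is negligible. If you end up with many thousands of blocks, the second-kernel option is probably worth it: run the same kind of reduction over block_mins with a single block.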