This program computes the histogram of a picture (1024*768 pixels). I run it on a GTS 320 MB.
I tried changing the number of threads and blocks to observe the influence of the block count. I got these results:
I thought the best time would be with 96 blocks (1 block per processor), but that doesn't seem to be the case. Can somebody explain whether I am right or not? If not, how do blocks and threads work, and which values are the most effective in this case?
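For reference, here is a minimal sketch of the kind of measurement described above. It is not the original program: the kernel name `histogramKernel`, the helper `timeLaunch`, the 256-bin layout, the random test image, and the list of block counts are all assumptions made for illustration. It also assumes a device and toolkit that support `atomicAdd` on global memory (compute capability 1.1 or later); older hardware would need a different accumulation scheme. The kernel uses a grid-stride loop so the result stays correct for any block/thread combination, and the program prints the multiprocessor count reported by the device, since blocks are scheduled onto multiprocessors rather than onto individual scalar processors.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

#define NUM_BINS 256  // assumption: 8-bit grayscale pixels

// Grid-stride histogram: every thread walks the image in steps of the
// total thread count, so any <<<blocks, threads>>> setting covers all pixels.
__global__ void histogramKernel(const unsigned char* pixels, int n,
                                unsigned int* bins)
{
    int stride = blockDim.x * gridDim.x;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        atomicAdd(&bins[pixels[i]], 1u);
}

// Hypothetical helper: clears the bins, runs one launch, returns its time in ms.
float timeLaunch(const unsigned char* dPixels, int n, unsigned int* dBins,
                 int blocks, int threads)
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaMemset(dBins, 0, NUM_BINS * sizeof(unsigned int));

    cudaEventRecord(start, 0);
    histogramKernel<<<blocks, threads>>>(dPixels, n, dBins);
    cudaEventRecord(stop, 0);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main()
{
    const int n = 1024 * 768;  // picture size from the post

    // Random test image plus a CPU reference histogram for checking.
    std::vector<unsigned char> hPixels(n);
    std::vector<unsigned int> ref(NUM_BINS, 0);
    for (int i = 0; i < n; ++i) {
        hPixels[i] = (unsigned char)(rand() % NUM_BINS);
        ++ref[hPixels[i]];
    }

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        fprintf(stderr, "No CUDA device found\n");
        return 1;
    }
    printf("%s: %d multiprocessors\n", prop.name, prop.multiProcessorCount);

    unsigned char* dPixels = 0;
    unsigned int* dBins = 0;
    cudaMalloc(&dPixels, n * sizeof(unsigned char));
    cudaMalloc(&dBins, NUM_BINS * sizeof(unsigned int));
    cudaMemcpy(dPixels, hPixels.data(), n, cudaMemcpyHostToDevice);

    const int threads = 128;                                // assumed block size
    const int blockCounts[] = {12, 24, 48, 96, 192, 384};   // assumed sweep
    const int numConfigs = sizeof(blockCounts) / sizeof(blockCounts[0]);

    // Untimed warm-up so the first measurement excludes start-up cost.
    timeLaunch(dPixels, n, dBins, blockCounts[0], threads);

    std::vector<unsigned int> out(NUM_BINS);
    for (int c = 0; c < numConfigs; ++c) {
        float ms = timeLaunch(dPixels, n, dBins, blockCounts[c], threads);
        if (cudaGetLastError() != cudaSuccess) {
            fprintf(stderr, "Launch failed for %d blocks\n", blockCounts[c]);
            return 1;
        }
        cudaMemcpy(out.data(), dBins, NUM_BINS * sizeof(unsigned int),
                   cudaMemcpyDeviceToHost);
        bool ok = (out == ref);
        printf("%4d blocks x %d threads: %.3f ms (%s)\n",
               blockCounts[c], threads, ms, ok ? "ok" : "MISMATCH");
        if (!ok) return 1;
    }

    cudaFree(dPixels);
    cudaFree(dBins);
    return 0;
}
```

Timings from a sketch like this depend heavily on contention in the atomic updates as well as on the launch configuration, so they may not match the behaviour of the original program.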