Top-level design doubts, need help


I’m working with two 640x480 images, ImageC and ImageR. I tile both with overlap: 38x100 tiles on ImageC and 20x20 tiles on ImageR.
Then, for each 20x20 tile (from ImageR), I slide it in one-pixel steps over one 38x100 tile, compute the dot product at each position, and save the result. I repeat this until the whole image has been processed.
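To make the per-tile work concrete, here is a minimal CPU reference in plain C++ for one tile pair. It is only a sketch under assumptions: the name `slideDot`, the row-major `float` layout, and passing the sizes as (height, width) are illustrative choices, not something fixed by the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Slide a th x tw tile over an sh x sw search tile (both row-major) in
// one-pixel steps and return the dot product at every valid offset.
// Hypothetical helper; names and data layout are assumptions.
std::vector<float> slideDot(const std::vector<float>& search, std::size_t sh, std::size_t sw,
                            const std::vector<float>& tile, std::size_t th, std::size_t tw) {
    assert(sh >= th && sw >= tw);
    assert(search.size() == sh * sw && tile.size() == th * tw);
    const std::size_t outH = sh - th + 1;  // vertical offsets
    const std::size_t outW = sw - tw + 1;  // horizontal offsets
    std::vector<float> out(outH * outW, 0.0f);
    for (std::size_t oy = 0; oy < outH; ++oy) {
        for (std::size_t ox = 0; ox < outW; ++ox) {
            float acc = 0.0f;
            for (std::size_t y = 0; y < th; ++y)
                for (std::size_t x = 0; x < tw; ++x)
                    acc += search[(oy + y) * sw + (ox + x)] * tile[y * tw + x];
            out[oy * outW + ox] = acc;
        }
    }
    return out;
}
```

If 38x100 is read as 38 wide by 100 high (an assumption), `slideDot(tileC, 100, 38, tileR, 20, 20)` gives 81 x 19 = 1539 offsets per tile pair, each costing 400 multiply-adds.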
I have thought of two top-level design options for doing that:

  1. Divide the image space between the blocks (each block loads its part from global memory, does its calculation, and saves the results back to global memory), or
  2. Divide the calculations between the blocks (each block is responsible for part of the calculation for every tile of the image).
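The difference between the two options can be written down as a mapping from block index to work. The sketch below is plain C++ on the CPU, not a kernel, and every name in it (`WorkItem`, `planByImageRegion`, `planByCalculation`) is hypothetical. It assumes the work for one tile pair is a flat range of slide offsets.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical descriptor: slide offsets [begin, end) of one tile pair.
struct WorkItem {
    std::size_t tilePair;
    std::size_t begin;
    std::size_t end;
};

// Option 1: block b owns whole tile pairs (b, b + numBlocks, ...).
std::vector<WorkItem> planByImageRegion(std::size_t b, std::size_t numBlocks,
                                        std::size_t numTilePairs, std::size_t numOffsets) {
    assert(numBlocks > 0);
    std::vector<WorkItem> plan;
    for (std::size_t p = b; p < numTilePairs; p += numBlocks)
        plan.push_back({p, 0, numOffsets});
    return plan;
}

// Option 2: block b owns the same slice of offsets in every tile pair.
std::vector<WorkItem> planByCalculation(std::size_t b, std::size_t numBlocks,
                                        std::size_t numTilePairs, std::size_t numOffsets) {
    assert(numBlocks > 0);
    const std::size_t chunk = (numOffsets + numBlocks - 1) / numBlocks;  // ceil division
    const std::size_t begin = std::min(b * chunk, numOffsets);
    const std::size_t end = std::min(begin + chunk, numOffsets);
    std::vector<WorkItem> plan;
    if (begin < end)
        for (std::size_t p = 0; p < numTilePairs; ++p)
            plan.push_back({p, begin, end});
    return plan;
}
```

Both plans cover the same total work. What changes is that in option 1 a block touches only its own tile pairs, while in option 2 every block touches data from every tile pair.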

Which is the better option?
Or is there another option that I don’t know about?