How many 2D cross-correlations per second?

Hi, I’m totally new to GPU computing. Could someone give me a rough estimate of how many 2D cross-correlations per second I could achieve with an appropriate GPU and CUDA? Every second I get around 100’000 templates of 20x20 pixels, each of which needs to be cross-correlated with around 20 subimages of 20x20 pixels. All 100’000 x 20 correlations could in theory run in parallel, since there are no dependencies between them.
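
To give a sense of scale, here is a rough sketch of the arithmetic behind those numbers, in plain Python. It assumes a direct (non-FFT) correlation and counts multiply-adds only. The helper name `full_xcorr_macs` is made up for illustration, and the two cases (a single zero-shift score per pair versus all 39x39 shifts per pair) are assumptions about what "cross-correlation" has to mean here.

```python
# Rough workload sizing for the numbers quoted above (a sketch, not a benchmark).
TEMPLATES_PER_SEC = 100_000   # from the post
SUBIMAGES_PER_TEMPLATE = 20   # from the post
N = 20                        # template and subimage side length, in pixels


def full_xcorr_macs(n: int) -> int:
    """Multiply-adds for a direct full 2D cross-correlation of two n x n images.

    Hypothetical helper: each shift (dx, dy) overlaps (n - |dx|) * (n - |dy|)
    pixels, so the total factorises into the square of a 1D sum.
    """
    per_axis = sum(n - abs(d) for d in range(-(n - 1), n))
    return per_axis * per_axis


pairs_per_sec = TEMPLATES_PER_SEC * SUBIMAGES_PER_TEMPLATE

# Assumption A: only the aligned (zero-shift) score is needed per pair.
zero_lag_macs = N * N

# Assumption B: all (2N - 1) x (2N - 1) = 39 x 39 shifts are needed per pair.
full_macs = full_xcorr_macs(N)

print(pairs_per_sec)                   # 2000000 template/subimage pairs per second
print(pairs_per_sec * zero_lag_macs)   # 800000000 multiply-adds per second
print(pairs_per_sec * full_macs)       # 320000000000 multiply-adds per second
```

So the load is 2 million pairs per second, which comes to somewhere between roughly 0.8 billion and 320 billion multiply-adds per second depending on which of the two cases applies. This ignores memory transfers and normalisation entirely.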

Any ideas?

Cheers.