Can a GPU Speed Up This Use Case?

Task: Given an MxN matrix, take every combination of X columns and run an analysis on each. For example, with X=2 the combos would be col1+col2, col1+col3, col2+col3, etc.
For each column grouping, I slice the matrix (keeping only the columns of interest) and run a conditional analysis. The analysis is basically: when col1 is in the 1st quartile and col2 is in the 3rd quartile, what is the result?
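To make the task concrete, here is a minimal CPU-only sketch in plain Python. It rests on assumptions that are not in the post: the "result" is a separate per-row `outcome` vector, the analysis is the mean of that outcome over the matching rows, and the names `quartile_labels` and `analyse` are made up for illustration. Quartile cut points come from `statistics.quantiles` with its default method.

```python
from itertools import combinations
from statistics import quantiles


def quartile_labels(column):
    """Label each value 1..4 according to the quartile of its column it falls in."""
    q1, q2, q3 = quantiles(column, n=4)  # the three quartile cut points
    return [1 + (v > q1) + (v > q2) + (v > q3) for v in column]


def analyse(columns, outcome, x=2, wanted=(1, 3)):
    """For every combination of x columns, average `outcome` over the rows where
    the i-th column of the combination sits in quartile wanted[i].

    `columns` is a list of columns (each a list of numbers); `outcome` is a
    hypothetical per-row result vector, assumed here for illustration.
    """
    # Quartile labels depend only on the column, so compute them once
    # instead of re-ranking inside every combination.
    labels = [quartile_labels(c) for c in columns]
    results = {}
    for combo in combinations(range(len(columns)), x):
        rows = [r for r in range(len(outcome))
                if all(labels[c][r] == q for c, q in zip(combo, wanted))]
        if rows:  # skip combinations with no matching rows
            results[combo] = sum(outcome[r] for r in rows) / len(rows)
    return results
```

Under these assumptions the per-column ranking is done N times rather than once per subset, and each combination only reads the shared labels.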

The MxN matrix itself is relatively small (on the order of KBs), but with many columns and combos of 3 or 4 columns, there are many, many combinations. Cumulatively, the subsets add up to large files (tens of GBs).
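A rough way to see how a small matrix turns into tens of GBs is to count the combinations. The sketch below assumes 8-byte values and that every subset is written out in full; the function names and the example sizes in the note afterwards are hypothetical, not figures from the post.

```python
from math import comb


def subset_count(n_cols, x):
    """Number of distinct x-column combinations out of n_cols columns."""
    return comb(n_cols, x)


def materialised_bytes(n_rows, n_cols, x, itemsize=8):
    """Bytes needed if every x-column slice is stored separately.

    itemsize=8 assumes 64-bit values; adjust for the real dtype.
    """
    return subset_count(n_cols, x) * n_rows * x * itemsize
```

For example, a hypothetical 500 x 100 matrix is only 400 KB, yet `materialised_bytes(500, 100, 4)` comes to roughly 62.7 GB, because there are 3,921,225 four-column combinations.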

So the use case is a lot of slicing, then ranking each subset, then aggregating.

I hired someone on Upwork to code this and he got zero speedup. I’m dubious and want to ask you guys.


Assuming the combos and resulting analyses are independent, it seems like it should be a reasonable fit:

  1. Large amount of independent work (“embarrassingly parallel”)
  2. Lots of data reuse

Those are usually good indicators for moving work to the GPU.

I’m guessing the “aggregating” part is not independent, but from your description it looks separable, so a final reduce on the GPU can (usually) also be done efficiently.
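The shape of that decomposition can be sketched on the CPU: one independent task per combination, followed by an associative reduce over the partial results. This is only an illustration of the structure, not GPU code. It assumes the per-column conditions are already available as boolean `flags`, that the aggregate is a (count, sum) pair, and it uses a thread pool purely as a stand-in for parallel workers; all names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from itertools import combinations


def partial_stats(task):
    """Independent unit of work: (count, sum) of outcomes for one combination."""
    flags, outcome, combo = task
    # A row matches when every column in the combination has its flag set.
    rows = [r for r in range(len(outcome)) if all(flags[c][r] for c in combo)]
    return (len(rows), sum(outcome[r] for r in rows))


def merge(a, b):
    """Associative merge of two (count, sum) pairs, so the order of merging is free."""
    return (a[0] + b[0], a[1] + b[1])


def run(flags, outcome, x=2):
    """Map every x-column combination to partial stats, then reduce them."""
    tasks = [(flags, outcome, combo)
             for combo in combinations(range(len(flags)), x)]
    # The thread pool only demonstrates that tasks share read-only data and
    # never talk to each other; it is not expected to be fast in CPython.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(partial_stats, tasks))
    return reduce(merge, partials, (0, 0))
```

Because `merge` is associative, the final step can be done as a tree reduction, which is the pattern GPU reduce primitives rely on. All tasks read the same `flags` and `outcome`, which is the data reuse mentioned above.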