I am trying to use CUDA to solve a problem that involves a very large number of combinations. I have an array of integers numbered 0-47 that I pass from the CPU to GPU.
On the GPU, from that array of 48 integers I want to select every possible combination of 5 unique numbers. Each thread on the GPU should map to one specific combination. That is 1,712,304 combinations (or 1,712,304 threads). The order of the numbers is irrelevant, so I do not want duplicates. In other words, if I use the combination 0 1 2 3 4 5, then 0 1 2 3 5 4 should not be used because we already hit that case (see http://en.wikipedia.org/wiki/Combination).
To avoid repetition, I have visualized the combinations as strings of increasing numbers. Every number in the string must be greater than the previous number. So, for example, here are example combinations from the beginning to the end:
0 1 2 3 4 5
0 1 2 3 4 6
0 1 2 3 4 7
…
0 1 2 3 4 47
0 1 2 3 5 6
0 1 2 3 5 7
0 1 2 3 5 8
…
0 1 2 3 5 47
0 1 2 3 6 7
0 1 2 3 6 8
…
43 44 45 46 45
43 44 45 46 46
43 44 45 46 47
I have a GeForce 8800, so I want to use the maximum 512 threads per block to maximize performance. This would equate to roughly 3,344 blocks.
This is my first CUDA program. I know that every thread has a unique block ID and unique thread ID. The question I have is, how can I use the block and thread ID for a given thread to uniquely identify the specific combination that thread is supposed to use? In other words if a given thread is using combination 4 5 32 41 48, then no other thread should be using that combination and every single combination should be being used by a thread.
I have spent all night trying to tackle this problem. I came up with one solution, but I do not believe it is optimal. Any ideas? :)