I’m right beginner in CUDA Programming. I want to make a Texas Hold’em Poker program in which I want to create an array of combinations. My problem is that the “2-out-of-52” is still fast on CPU but the “5-out-of-52” and the “7-out-of-52” has too many combinations to solve it fast on CPU. I want to work only with numbers from 0 to 51 and combine them.
Does anyone have an idea how could I solve it? My two main problem is that I don’t know how should I use the threadIdx and my GPU’s compute capability is only 1.1.
First off, what algorithm are you using? What’s it look like?
Second, threadIdx.x is the thread ID of the current block the kernel is executing. In CUDA, you execute in grids which are made of blocks which are made of threads.
For a single block, threadIdx.x gives you the thread ID of that thread in the block.
For multiple blocks, you index the thread like this :
const int tid = threadIdx.x + blockIdx.x * blockDim.x
My algorithm for the CPU used the next_permutation() function, but that’s what I cannot use in a kernel function… :(
It sounds interesting…I’m gonna try it out. Thanks!
Not the money is my problem, but these cards are very rare in Hungary…
It’s not so easy to get it because I live in a small town and I don’t have much time to solve this problem… :)
Keep in mind that there is a difference between a permutation and a combination. With a permutation order matters, while with a combination order does not matter.
In your case order does not matter in a given hand, so you would not need to use next_permutation,as it would be overkill.
I was saying that the implementation of algorithm would use the decomposition process to get the distinct members of the combination.
You also use a 64-bit unsigned long long type, go through all numbers from 0 to 2^52-1, count the bits and see if there is 7 bits set, and if there is then determine which hand that represents. That approach would be excessive though and take longer than the approach I first mentioned.