Forward-looking GPU integer performance

What is a prime calculator? A factorization program? Have you looked at mfaktc, for example?

Let me explain a bit more: we currently have a program that calculates prime k-tuplets on the CPU.
However, we would like to port it to the GPU with CUDA… I need help with that.

Could you provide a pointer to your CPU reference code (e.g. GitHub)? What does your current CUDA port look like, and what did the CUDA profiler identify as a bottleneck in that code?