Factor "RSA like" 100 digits number

We had an assignment in math class where we used Maple to learn a tiny bit about algorithms and encryption, and one of the bonus goals was to decrypt another student's message by cracking his n, n being the product of two ~50-digit primes, p and q.
Maple, being poorly optimized and running serially, was nowhere close to factoring the 100-digit number after over 18 h of computing.

Since I know that parallel computing is a better approach and I have a GTX 570, I thought I could kill two birds with one stone: learn a bit of CUDA and find the teacher's n. (If you find that one, you can find the n of every other student.)

I'd like to know how to translate this algorithm to CUDA and parallelize it, or whether there is a better one, as this was the best I could do with the program I had.

p = 50-digit prime (not known)
q = 50-digit prime (not known)
n = p*q (known)

p = sqrt(n) rounded down to the nearest odd integer
q = n/p

while q is not an integer
    p = p-2
    q = n/p
return/print p and q

I have a bit of experience with Java (NetBeans), but unfortunately not with C/C++.

Also, would anyone have an estimate of how long it would take on one GTX 570 / i5-2500K?

Let’s assume we represent the operands as arrays of 32-bit words (since the GPU has a 32-bit architecture). This means for the trial division we are looking at a 352-bit dividend. Based on the number of instructions used by CUDA’s 64-bit division (around 70), it seems that a 352-bit long-integer division could be accomplished with about 2000 instructions, about half of which would be integer multiplies or multiply-adds. The instruction throughput of a GTX570 for this instruction mix may be around 200e9 instructions per second. So we would be able to perform on the order of 1e8 trial divisions per second. Quickly eliminating multiples of 2, 3, and 5 as possible candidates for p leaves about 1e49 divisors to examine, so it would take 1e41 seconds to scan the entire divisor search space. How much calendar time this represents is left as an exercise for the reader :-)
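Spelled out, the rough arithmetic behind those figures is:

~2e11 instructions/second ÷ ~2e3 instructions per trial division ≈ 1e8 trial divisions/second
~1e49 candidate divisors ÷ 1e8 trial divisions/second ≈ 1e41 seconds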

Note that the above is an example of a back-of-the-envelope calculation using very rough estimates for all quantities involved.

Do you really need to perform a full division each time? Proceeding from one divisor to the next would just need an addition to apply the delta, and a few subtractions to test possible results.

Any further optimizations I’ll leave to experts in the field of cryptography.

I have no hands-on experience with factorization. Wikipedia provides a quick overview that could serve as the starting point for a literature search: [url]http://en.wikipedia.org/wiki/Integer_factorization[/url]

Didn’t think about modulo, thanks.

Just ~3.17e33 years, still a lot faster than what I had.
As we already know that the number has only 2 factors of ~50 digits, I only need to search between 1e44 and sqrt(n), and I don’t need to check 2, 3, 5…

p = 50-digit prime (not known)
q = 50-digit prime (not known)
n = p*q (known)

p = sqrt(n) rounded down to the nearest odd integer
q = n%p

while q != 0
{
    p = p-2
    q = n%p
}
q = n/p
return/print p and q
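For what it's worth, here is a minimal sequential sketch of that loop (my own illustration, not from the assignment), written as plain C that would also compile as CUDA host code. It uses 64-bit integers, so it only works for a small n like the made-up example below; the real 100-digit case would need multi-precision (multi-word) arithmetic, which is the hard part:

#include <stdio.h>
#include <stdint.h>
#include <math.h>

/* Downward trial division: find the largest factor p <= sqrt(n).
   Assumes n is a product of two odd primes and fits in 64 bits. */
static void factor_trial(uint64_t n)
{
    uint64_t p = (uint64_t)sqrt((double)n);
    if (p % 2 == 0) p--;                 /* round down to an odd integer */
    while (n % p != 0)                   /* "q != 0" in the pseudocode   */
        p -= 2;                          /* next smaller odd candidate   */
    printf("p = %llu, q = %llu\n",
           (unsigned long long)p, (unsigned long long)(n / p));
}

int main(void)
{
    factor_trial(1022117ULL);            /* 1009 * 1013, tiny stand-in   */
    return 0;
}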

I did not mean to suggest trying division by 2, 3, or 5. I was suggesting that instead of iterating through all odd numbers (thus eliminating multiples of 2), use a slightly more complicated pattern of decrements to skip over multiples of 2, 3, and 5, which should reduce the number of candidates by about a factor of 10. To first order, there are 1e50 50-digit numbers, so this gets us down to 1e49 candidates.
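To make that concrete, here is a small sketch of mine (the starting value 209 is just an arbitrary example) of such a decrement pattern, often called a mod-30 wheel: out of every 30 consecutive integers, only the 8 that are coprime to 2, 3, and 5 are visited. Going downward from a value congruent to 29 mod 30, the decrements repeat as 6, 4, 2, 4, 2, 4, 6, 2:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Decrements that step from one number coprime to 30 down to the next,
       starting from a value that is 29 mod 30:
       29 -> 23 -> 19 -> 17 -> 13 -> 11 -> 7 -> 1 -> 29 (of the block below) */
    const uint64_t deltas[8] = {6, 4, 2, 4, 2, 4, 6, 2};
    uint64_t p = 209;                    /* example start, 209 = 6*30 + 29 */
    int i = 0;
    while (p > 100) {
        printf("%llu\n", (unsigned long long)p);   /* not divisible by 2, 3, or 5 */
        p -= deltas[i];
        i = (i + 1) % 8;
    }
    return 0;
}

In the factoring loop, p would instead start at the largest number <= sqrt(n) that is coprime to 30, and each visited candidate would be tested with n % p as before.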

If I can identify the digits of a number easily, then yes, it would probably speed this up a lot.

if (lastdigit == 5)           // p ends in 5, so it is divisible by 5: skip it
{}
else if (sumdigits%3 == 0)    // digit sum divisible by 3, so p is divisible by 3: skip it
{}
else
    q = n%p                   // only then do the expensive modulo

And concerning your previous post, can I use CUDA like NetBeans, or is it a lot more complicated?
Because I guess the IDE isn’t going to parallelize it for me.

This is going to be a very hard first project with CUDA, especially if you don’t know C or C++. (NetBeans is an IDE and CUDA C/C++ is a programming language, so I have no idea how one would compare those two things.)

The CUDA compiler does not automatically parallelize code for you. You have to write special code to run on the GPU, and code to run on the CPU that launches the GPU code and collects the results.
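Just to illustrate that structure (this is a sketch of mine, not a solution to the actual problem): a kernel that assigns one odd candidate divisor to each thread, plus the host code that launches it. It uses 64-bit integers with a tiny made-up n, so it cannot handle a real 100-digit number; for that, the n % p inside the kernel would have to be replaced by multi-word arithmetic:

#include <cstdio>
#include <cstdint>
#include <cuda_runtime.h>

// Each thread tests one odd candidate divisor of n (64-bit sketch only).
__global__ void trial_divide(uint64_t n, uint64_t start, uint64_t *found_p)
{
    uint64_t idx = blockIdx.x * (uint64_t)blockDim.x + threadIdx.x;
    uint64_t p = start - 2 * idx;        // thread i tests the i-th odd number below start
    if (p > 1 && p <= start && n % p == 0)
        *found_p = p;                    // any thread that finds a factor reports it
}

int main()
{
    uint64_t n = 1022117;                // 1009 * 1013, tiny stand-in for the real n
    uint64_t start = 1009;               // largest odd number <= sqrt(n)
    uint64_t h_p = 0, *d_p;
    cudaMalloc(&d_p, sizeof(uint64_t));
    cudaMemcpy(d_p, &h_p, sizeof(uint64_t), cudaMemcpyHostToDevice);

    trial_divide<<<256, 256>>>(n, start, d_p);    // 65536 candidates per launch
    cudaMemcpy(&h_p, d_p, sizeof(uint64_t), cudaMemcpyDeviceToHost);

    if (h_p) printf("p = %llu, q = %llu\n",
                    (unsigned long long)h_p, (unsigned long long)(n / h_p));
    cudaFree(d_p);
    return 0;
}

In a real run the host would loop over launches, lowering start by twice the number of threads per launch, until some thread reports a factor.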

I’m also confused as to what the ultimate goal is here. Are you trying to learn CUDA or factor a 100 digit number?

The main problem with your original approach is not the serial vs. parallel question, but the use of a very slow algorithm. If you want to factor a 100-digit number in a reasonable time, the state of the art is to use one of the number field sieve algorithms. According to this page:

A 100-digit number can be factored in a matter of hours with a single CPU. NFS is not very easy, but I think it is one of the few tractable approaches to factoring a number that large.

I said CUDA because I don’t know the name of the IDE; I know it’s not the same thing, sorry for the confusion.

Found a program on the wiki page that may do the job; I’ll give it at least 3 hours. (And it seems it might do it.)