Big Integer Arithmetic: Anyone trying to do big ints on CUDA?

I’ve been playing with big integers on my 8800GTX, with satisfying results. I can get 13k 512-bit Montgomery exponentiations per second, which is enough for 6.5k private key decrypts (essentially SSL server handshakes) per second.

Has anyone else tried implementing big integer arithmetic? A big problem with CUDA as it stands is that PTX doesn’t provide an add-with-carry instruction. I read somewhere that this functionality exists in the NV8 hardware but just isn’t exposed; does anyone know if this is true?
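
For reference, the usual workaround when no carry flag is exposed is to recover the carry with an unsigned comparison. Here is a minimal sketch in C, assuming 32-bit limbs stored least significant first. The names bigint_add and BIGINT_HD are made up for illustration, and the macro is only there so the same source builds under nvcc or a plain C compiler. It is not meant as a tuned kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: device-callable under nvcc, empty in plain C. */
#ifdef __CUDACC__
#define BIGINT_HD __host__ __device__
#else
#define BIGINT_HD
#endif

/* r = a + b over n 32-bit limbs, least significant limb first.
   Returns the carry out of the top limb (0 or 1). */
BIGINT_HD static uint32_t bigint_add(uint32_t *r, const uint32_t *a,
                                     const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; ++i) {
        uint32_t s  = a[i] + carry;  /* may wrap around to 0 */
        uint32_t c1 = s < carry;     /* 1 if adding the carry wrapped */
        uint32_t t  = s + b[i];      /* may wrap around */
        uint32_t c2 = t < b[i];      /* 1 if adding b[i] wrapped */
        r[i]  = t;
        carry = c1 | c2;             /* c1 and c2 are never both 1 */
    }
    return carry;
}
```

Each limb costs two adds and two compares instead of a single add-with-carry, which is why a real addc would matter for throughput.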

There is some info on the addc instruction in the change log. Maybe it helps? Follow the link, please:

Yes, G80 has an addc instruction. It’s used if you add 64-bit integers. Too bad PTX doesn’t provide a way to insert it manually. You could use my assembler, but be careful, it’s still very experimental :)
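
The 64-bit route can be sketched like this: keep the limbs 32 bits wide but accumulate in a 64-bit variable, so the carry is simply the upper half of the sum. This is a sketch in C under the same assumptions (least significant limb first); bigint_add64 and BIGINT_HD are illustrative names. Whether the compiler really turns the 64-bit add into an add/addc pair depends on the toolchain, so check the generated code before relying on it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro: device-callable under nvcc, empty in plain C. */
#ifdef __CUDACC__
#define BIGINT_HD __host__ __device__
#else
#define BIGINT_HD
#endif

/* r = a + b over n 32-bit limbs, using a 64-bit accumulator.
   Returns the carry out of the top limb (0 or 1). */
BIGINT_HD static uint32_t bigint_add64(uint32_t *r, const uint32_t *a,
                                       const uint32_t *b, size_t n)
{
    uint64_t acc = 0;                    /* holds the running carry */
    for (size_t i = 0; i < n; ++i) {
        acc += (uint64_t)a[i] + b[i];    /* at most 2^33 - 1, cannot overflow */
        r[i] = (uint32_t)acc;            /* low 32 bits are the result limb */
        acc >>= 32;                      /* high bits become the next carry */
    }
    return (uint32_t)acc;
}
```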