Does anyone know of anyone who has successfully implemented integer prime factorization on the GPU (with a performance improvement over the CPU)?

CUDA 1.0 has 64-bit integer support (according to the changelog), but I'm talking about integers of 1024 bits or more. Arithmetic on 1024-bit integers can probably be emulated with 64-bit integers, but I suspect there won't be much of a performance benefit. I'm scouring the internet for links on this topic, so any additional information would be useful.
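
To make the emulation idea concrete, here is a minimal sketch of 1024-bit addition built from 64-bit integers. It is plain C++ rather than CUDA, and the names `U1024`, `kLimbs`, and `add1024` are made up for illustration, not taken from any library. It assumes the number is stored as 16 limbs, least significant first, and it only shows addition; multiplication and modular reduction, which factorization would actually need, are not covered.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical representation: a 1024-bit unsigned integer stored as
// 16 limbs of 64 bits each, least significant limb first.
constexpr std::size_t kLimbs = 16;
using U1024 = std::array<std::uint64_t, kLimbs>;

// Computes out = a + b (mod 2^1024) and returns the carry out of the
// top limb (0 or 1). Each step adds two limbs plus the incoming carry
// and detects overflow by checking whether the sum wrapped around.
inline std::uint64_t add1024(const U1024& a, const U1024& b, U1024& out) {
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < kLimbs; ++i) {
        std::uint64_t s = a[i] + carry;         // may wrap if a[i] is all ones
        std::uint64_t c1 = (s < carry) ? 1 : 0; // wrapped on the carry add
        std::uint64_t t = s + b[i];             // may wrap on the limb add
        std::uint64_t c2 = (t < b[i]) ? 1 : 0;  // wrapped on the limb add
        out[i] = t;
        carry = c1 | c2;                        // at most one of them is set
    }
    return carry;
}
```

Note that the carry makes each limb depend on the one before it, so a single addition is sequential. That dependency is part of why I doubt a straightforward port of this loop would gain much on its own.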

