1024-bit or larger integer data type for use in CUDA

Dear friends,
I am implementing an algorithm that generates minimal-length addition chains for big integers (1024 bits or more). Which library supports such a data type and also works in CUDA?
Thanks a lot.

I’ve implemented 1024-bit integers using 32 int variables spread across one warp (one 32-bit word per thread).

It allowed me to do big integer maths (modular exponentiation) with a nicely low register count.

Carry propagation on addition/subtraction is a bit tricky:
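To illustrate why the carry is awkward, here is a minimal sequential sketch in plain C++. It is not the warp-based implementation described above, only the single-threaded logic that such code has to reproduce. The names `kLimbs`, `add_limbs` and `sub_limbs` are made up for this example, and the limbs are assumed to be stored least significant first. The point is that a carry out of limb i feeds limb i+1, so in the worst case (for example 0xFFFFFFFF in every limb, plus 1) it ripples through all 32 limbs.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical layout: 32 limbs of 32 bits each = 1024 bits, limb 0 is least significant.
constexpr std::size_t kLimbs = 32;

// a += b. Returns the carry out of the most significant limb (0 or 1).
inline std::uint32_t add_limbs(std::uint32_t* a, const std::uint32_t* b) {
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < kLimbs; ++i) {
        // The 64-bit sum cannot overflow: (2^32 - 1) + (2^32 - 1) + 1 < 2^64.
        std::uint64_t sum = static_cast<std::uint64_t>(a[i]) + b[i] + carry;
        a[i] = static_cast<std::uint32_t>(sum);  // keep the low 32 bits
        carry = sum >> 32;                       // the high part is the carry into limb i+1
    }
    return static_cast<std::uint32_t>(carry);
}

// a -= b. Returns the borrow out of the most significant limb (0 or 1).
inline std::uint32_t sub_limbs(std::uint32_t* a, const std::uint32_t* b) {
    std::uint64_t borrow = 0;
    for (std::size_t i = 0; i < kLimbs; ++i) {
        // Unsigned wrap-around sets the top bit exactly when the true result is negative.
        std::uint64_t diff = static_cast<std::uint64_t>(a[i]) - b[i] - borrow;
        a[i] = static_cast<std::uint32_t>(diff);
        borrow = diff >> 63;
    }
    return static_cast<std::uint32_t>(borrow);
}
```

When each limb lives in a different thread, this loop-carried dependency is what has to be resolved across the warp, which is where the trickiness comes from.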

There are other libraries that are more similar to the well-known GMP, such as CUMP.


Thank you very much for the fast reply. I have already tried CUMP, but it has a lot of problems: there is no documentation for it, and it lacks simple arithmetic functions such as log and base-2 log. I am new to CUDA, so I was thinking there might be a native library or native data type that supports big integers. I will try your implementation, thanks a lot.

One more thing: I only need a data type that can be handled on both host and device. I don't need the arithmetic functions to be implemented in parallel on CUDA (I mean I don't need parallel multiplication, for example).
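As a rough sketch of what such a type could look like (an assumption for illustration, not taken from CUMP or any other library): a plain struct holding a fixed array of 32-bit limbs, with small helper functions. The names `BigUInt1024`, `from_u64`, `bit_length`, `compare` and the `HD` macro are all hypothetical. The macro expands to nothing under a regular C++ compiler; under nvcc it is meant to mark the helpers as callable from both host and device code. `bit_length` gives the base-2 log in the form floor(log2(x)) + 1 for x > 0.

```cpp
#include <cstdint>

// Assumption: when compiled with nvcc, mark helpers for host and device; otherwise plain C++.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

constexpr int kWords = 32;  // 32 x 32 bits = 1024 bits

// Fixed-size, trivially copyable value type; word[0] is the least significant word.
struct BigUInt1024 {
    std::uint32_t word[kWords];
};

// Builds a big integer from a 64-bit value; all higher words are zero.
HD inline BigUInt1024 from_u64(std::uint64_t v) {
    BigUInt1024 r{};
    r.word[0] = static_cast<std::uint32_t>(v);
    r.word[1] = static_cast<std::uint32_t>(v >> 32);
    return r;
}

// Number of significant bits: 0 for x == 0, otherwise floor(log2(x)) + 1.
HD inline int bit_length(const BigUInt1024& x) {
    for (int i = kWords - 1; i >= 0; --i) {
        std::uint32_t w = x.word[i];
        if (w != 0) {
            int bits = 0;
            while (w != 0) { ++bits; w >>= 1; }
            return 32 * i + bits;
        }
    }
    return 0;
}

// Three-way comparison: -1 if a < b, 0 if equal, 1 if a > b.
HD inline int compare(const BigUInt1024& a, const BigUInt1024& b) {
    for (int i = kWords - 1; i >= 0; --i) {
        if (a.word[i] != b.word[i]) return a.word[i] < b.word[i] ? -1 : 1;
    }
    return 0;
}
```

Because the struct contains nothing but a fixed-size array, it can be copied byte for byte between host and device memory. Addition, doubling and the other operations an addition-chain search needs would be added as further helpers in the same style.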

this library may be of interest: