Integer carry chains

Hello,

I’m trying to add together some large integers (on the order of several hundred bits). I was planning on adding together the lowest 32 bits, then the next 32 bits plus the carry from the previous stage, and so on.

My only problem is that I cannot find a CUDA C function that does this. Does one exist? I see that the PTX ISA has the addc instruction to handle this. Is my only option to code this part directly in PTX? From what I can tell, there is no way to “insert” PTX instructions inline with the C code, so it looks like there is no elegant way to use this approach.

I found this guy’s [mpi-inf.mpg.de] compiler add-on that should give me add with carry; is this the only option other than writing straight-up PTX code?

While not officially supported, the CUDA C compiler does support inline PTX assembly.
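
For illustration, here is a minimal sketch of what that could look like for a 128-bit add built from four 32-bit limbs (least significant limb first). It uses the add.cc / addc.cc / addc instructions from the PTX ISA inside a single asm() statement, so the carry flag never has to survive between separate statements. The names add128, add128_kernel and add128_on_device are made up for this example, and the carry out of the top limb is simply dropped; a wider integer would follow the same pattern with more addc.cc lines.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical helper: r = a + b for 128-bit values stored as four 32-bit
// limbs, least significant limb first. The carry out of the top limb is
// discarded, so the result is taken modulo 2^128.
__device__ __forceinline__ void add128(uint32_t r[4],
                                       const uint32_t a[4],
                                       const uint32_t b[4])
{
    uint32_t r0, r1, r2, r3;
    asm("add.cc.u32  %0, %4, %8;\n\t"   // limb 0: writes the carry flag
        "addc.cc.u32 %1, %5, %9;\n\t"   // limb 1: reads and writes the carry flag
        "addc.cc.u32 %2, %6, %10;\n\t"  // limb 2: reads and writes the carry flag
        "addc.u32    %3, %7, %11;\n\t"  // limb 3: reads the carry flag only
        : "=r"(r0), "=r"(r1), "=r"(r2), "=r"(r3)
        : "r"(a[0]), "r"(a[1]), "r"(a[2]), "r"(a[3]),
          "r"(b[0]), "r"(b[1]), "r"(b[2]), "r"(b[3]));
    r[0] = r0; r[1] = r1; r[2] = r2; r[3] = r3;
}

// Single-thread kernel that only exists to exercise add128.
__global__ void add128_kernel(uint32_t *r, const uint32_t *a, const uint32_t *b)
{
    add128(r, a, b);
}

// Host-side convenience wrapper (also hypothetical): copies the operands to
// the device, runs the kernel, and copies the sum back.
// Returns false if a CUDA runtime call fails.
bool add128_on_device(uint32_t r[4], const uint32_t a[4], const uint32_t b[4])
{
    uint32_t *d = nullptr;  // layout: a at d[0..3], b at d[4..7], r at d[8..11]
    if (cudaMalloc(&d, 12 * sizeof(uint32_t)) != cudaSuccess) return false;
    cudaMemcpy(d,     a, 4 * sizeof(uint32_t), cudaMemcpyHostToDevice);
    cudaMemcpy(d + 4, b, 4 * sizeof(uint32_t), cudaMemcpyHostToDevice);
    add128_kernel<<<1, 1>>>(d + 8, d, d + 4);
    // This blocking copy also waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(r, d + 8, 4 * sizeof(uint32_t),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d);
    return err == cudaSuccess;
}
```

As a quick check, adding 1 to 0x0000000000000000FFFFFFFFFFFFFFFF should carry through the two low limbs and leave limbs {0, 0, 1, 0}.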

Thanks, I’ll give it a try.
