Assembler instructions on 80xx platform


I’ve got some questions that can (most likely) only be answered by someone at NVIDIA. All are more or less assembler-related, so please don’t blame me for asking about stuff that isn’t supported yet…

(1) add-with-carry

Is there an add-with-carry instruction on the 80xx platform?

Some thoughts about adc:

  • I guess there would be mnemonics to set and get the carry as well?!

  • Which instructions change the carry? What happens if the thread is stopped before we save the carry? Or is the carry a (thread-owned) register? Just curious…
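To make it concrete why I'm asking: without adc, the carry has to be recovered with a compare after the low-word add. A minimal sketch in plain C (the type and function names are made up for illustration, and this is just the usual software workaround, not anything taken from the hardware docs):

```c
#include <stdint.h>

/* Hypothetical helper type: a 64-bit value held as two 32-bit words. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_words;

/* 64-bit add built from 32-bit adds. The carry out of the low word is
   recovered by comparison, since there is no carry flag to read here. */
static u64_words add_with_carry_emulated(u64_words a, u64_words b)
{
    u64_words r;
    uint32_t carry;

    r.lo = a.lo + b.lo;          /* wraps modulo 2^32 */
    carry = (r.lo < a.lo);       /* 1 if the low add wrapped, else 0 */
    r.hi = a.hi + b.hi + carry;  /* the step a real adc would do in one go */
    return r;
}
```

With a real add-with-carry, the compare and the extra add would presumably collapse into a single instruction per word, which matters for long multi-word additions.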

(2) mulhi, mul, etc

What happens when we call both mul and mulhi? Is the multiplication computed twice, or is there a way to tell the compiler to compute it once and fill two output registers?
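Roughly what I mean, as a plain C sketch (the function names are hypothetical stand-ins, not the actual intrinsics): the first two functions are the "call both" pattern, the third is the "compute once, fill two outputs" pattern I'm hoping for.

```c
#include <stdint.h>

/* Low 32 bits of a * b, i.e. what an ordinary 32-bit multiply returns. */
static uint32_t mul_lo(uint32_t a, uint32_t b)
{
    return a * b;
}

/* High 32 bits of the full 64-bit product a * b. */
static uint32_t mul_hi(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* One widening multiply, then split the 64-bit product into both halves. */
static void mul_wide_split(uint32_t a, uint32_t b, uint32_t *lo, uint32_t *hi)
{
    uint64_t p = (uint64_t)a * (uint64_t)b;

    *lo = (uint32_t)p;
    *hi = (uint32_t)(p >> 32);
}
```

Whether calling mul_lo and mul_hi back to back costs two multiplies or gets merged into one is exactly the part I can't tell from the outside.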

(3) PTX assembler

When can we expect a PTX manual and support for inline assembler? AFAIK the FAQ says something like “soon”.



  1. The hardware does support integer add with carry, but I’m not sure if it’s exposed in PTX currently.

  2. There is a mul.wide instruction that multiplies two 32-bit operands and produces a 64-bit result.

  3. The PTX specification and assembler will be included in the CUDA 1.0 release.