Openness about 'real' cubin instructions

Curious, I have never seen that before. Can you post an example of a .cu or .ptx file that generates this? Change line 277 of Disass.py to

               elif cmd == "consts" or cmd == "mem" or cmd == "sampler" or cmd == "const" or cmd == "reloc":

and it should ignore the section.
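The chained comparisons in that condition could equally be written as a membership test. This is a minimal sketch, assuming `cmd` is a plain string holding the section keyword; the names `IGNORED_SECTIONS` and `is_ignored_section` are made up for illustration and are not taken from Disass.py.

```python
# Hypothetical names; only the keyword strings come from the condition above.
IGNORED_SECTIONS = {"consts", "mem", "sampler", "const", "reloc"}


def is_ignored_section(cmd):
    """Return True if cmd is one of the section keywords in the condition."""
    return cmd in IGNORED_SECTIONS
```

With a set, ignoring one more section means adding one string rather than another `or` clause.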

The join instruction would be really handy if we could just hard-code it in PTX…
If an assembler could be worked out, we could have a completely independent tool chain (if I’m allowed to release my front end). But I guess the assembler may need some more work after G92 comes out.

My boss would surely be mad at me if I posted THAT PTX :(
Well, never mind, that’s not a big issue, I could just delete it anyway.

It also seems to have a problem disassembling pass1 of my scan.

Yes, the next logical step from here is an assembler, so you can at least edit the code and recompile. Or experiment with new combinations of instructions.

What kind of front-end are you working on?

Edit: yes, just send it to my mail, or something that looks like it but has the same problem :)

Just a non-optimizing front end for a C++-like language for CUDA, to address certain issues.
I started writing it when I got fed up with 0.8’s bugs. The 1.0 compiler still turns out to over-optimize, so I kept working on it.

It appears that a DX10 constant buffer is the same thing as CUDA constant memory. Both support 16 segments, although CUDA generally only uses 0 (global constant memory), 1 (kernel local constant memory) and 14 (relocations).
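The segment numbering above can be written down as a small lookup table. This is only a sketch: the names `KNOWN_SEGMENTS` and `describe_segment` are made up for illustration, and the meanings are the ones observed in this thread, not an official specification.

```python
# Segment count and usages as described in the post above (observations only).
NUM_CONSTANT_SEGMENTS = 16

KNOWN_SEGMENTS = {
    0: "global constant memory",
    1: "kernel local constant memory",
    14: "relocations",
}


def describe_segment(index):
    """Return the observed use of a constant segment, if any."""
    if not 0 <= index < NUM_CONSTANT_SEGMENTS:
        raise ValueError("constant segment index must be in 0..15")
    return KNOWN_SEGMENTS.get(index, "not generally used by CUDA")
```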

I am new to decuda. What is the ‘join’ instruction for?

Thank you.

It is a hint to help the hardware handle divergent branches. Placed before a point of divergence, it indicates where the control paths will merge again.

A typical if-then-else block will be compiled to:

    # condition in p1
    join endif          # both paths merge at endif
    @p1.eq br else      # if not p1 goto else
                        # here paths may diverge
      # code when p1 is true
    br endif
    else:
      # code when p1 is false
    endif:
    nop.join            # merge control paths again
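To make the divergence and merge concrete, here is a toy model in Python. It is an assumption-laden sketch, not a description of the real hardware: it takes one predicate value per thread, assumes the "then" path runs before the "else" path, and assumes the mask saved by `join` is simply restored at `nop.join`. The function name `run_if_else` is hypothetical.

```python
def run_if_else(p1):
    """Trace which threads are active in each part of the if-then-else.

    p1 holds one boolean predicate per thread. Returns a list of
    (label, active_thread_ids) pairs in the order the toy model runs them.
    """
    all_threads = list(range(len(p1)))
    trace = []

    # join endif: remember where, and with which threads, to merge again.
    join_stack = [("endif", all_threads)]

    # @p1.eq br else: threads whose p1 is false jump to else,
    # the others fall through into the "then" code.
    then_threads = [t for t in all_threads if p1[t]]
    else_threads = [t for t in all_threads if not p1[t]]

    # Assumed order: each non-empty path runs on its own, one after the other.
    if then_threads:
        trace.append(("then", then_threads))
    if else_threads:
        trace.append(("else", else_threads))

    # nop.join: pop the saved entry and continue with all threads together.
    trace.append(join_stack.pop())
    return trace
```

For example, `run_if_else([True, False, True, False])` traces the "then" code with threads 0 and 2, the "else" code with threads 1 and 3, and then "endif" with all four. When every thread agrees on `p1`, only one of the two paths appears in the trace.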

Just adding my 2 cents to this really old thread.

As far as I know, Intel doesn’t publish its internal microcode either; all that is published is the specification of the x86 instruction set and its extensions (SSE, MMX, etc.).

This is very much in line with what nVidia is doing: publishing the PTX instruction set while hiding the details of the microcode used by current graphics chip generations.

Christian