CUDA SASS question

Hi, I’m new to CUDA SASS code. I noticed that the code below is the same in all of my kernels. What does it mean?

/*0000*/         MOV R1, c[0x1][0x100];                /* 0x2800440400005de4 */
/*0008*/         S2R R0, SR_CTAID.X;                   /* 0x2c00000094001c04 */
/*0010*/         S2R R3, SR_TID.X;                     /* 0x2c0000008400dc04 */
/*0018*/         MOV R4, c[0x0][0x20];                 /* 0x2800400080011de4 */
/*0020*/         ISUB R1, R1, 0x40;                    /* 0x4800c00100105d03 */
/*0028*/         IMAD R8, R0, c[0x0][0x8], R3;         /* 0x2006400020021ca3 */
/*0030*/         IMAD R22, R8, 0x4b, R4;               /* 0x2008c0012c859ca3 */
/*0038*/         LOP.OR R4, R1, c[0x0][0x4];           /* 0x6800400010111c43 */

And what is the code below trying to do?

/*0560*/         I2I.U32.U16 R5, R4;                   /* 0x1c00000010a15c04 */
/*0568*/         I2I.U32.U16 R6, R3;                   /* 0x1c0000000ca19c04 */
/*0570*/         ISETP.GT.U32.AND P0, PT, R5, R6, PT;  /* 0x1a0e00001851dc03 */
/*0578*/     @P0 EXIT;                                 /* 0x80000000000001e7 */
/*0580*/         IADD R0, R0, 0x1;                     /* 0x4800c00004001c03 */
/*0588*/         I2I.U32.U16 R5, R3;                   /* 0x1c0000000ca15c04 */
/*0590*/         I2I.U32.U16 R3, R4;                   /* 0x1c00000010a0dc04 */
/*0598*/         ISETP.LT.AND P0, PT, R0, 0x20, PT;    /* 0x188ec0008001dc23 */
/*05a0*/         ISETP.GE.U32.AND P0, PT, R3, R5, P0;  /* 0x1b0000001431dc03 */
/*05a8*/     @P0 BRA 0x540;                            /* 0x4003fffe400001e7 */
/*05b0*/         MOV32I R0, 0x1;                       /* 0x1800000004001de2 */
/*05b8*/         ST.U8 [R23+0x8], R0;                  /* 0x9000000021701c05 */
/*05c0*/         BRK;                                  /* 0xa800000000001de7 */
/*05c8*/         EXIT;                                 /* 0x8000000000001de7 */

Some of the syntax is not described in detail in the NVIDIA docs.
Thanks for your help.

Such questions are easier to answer if you also show the corresponding source code. Even then it can be difficult to back-annotate SASS since the CUDA compiler optimizes aggressively. The first snippet looks like it could be part of an indexing computation based on thread index.
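
As a rough illustration of that guess, the two IMAD instructions in the first snippet can be modeled in plain C++. This is only a sketch: it assumes that c[0x0][0x8] holds blockDim.x and that c[0x0][0x20] holds the first kernel argument, neither of which is confirmed in this thread, and the names model_prologue and PrologueResult are made up for the example. The MOV/ISUB on R1 and the LOP.OR are not modeled.

```cpp
#include <cstdint>

// Hypothetical model of the integer arithmetic in the first snippet.
// Assumptions (not confirmed by the original post):
//   c[0x0][0x8]  = blockDim.x
//   c[0x0][0x20] = first kernel argument, treated here as a 32-bit value
struct PrologueResult {
    std::uint32_t r8;   // global thread index
    std::uint32_t r22;  // first argument plus index * 0x4b
};

inline PrologueResult model_prologue(std::uint32_t ctaid_x,  // S2R R0, SR_CTAID.X
                                     std::uint32_t tid_x,    // S2R R3, SR_TID.X
                                     std::uint32_t ntid_x,   // c[0x0][0x8]
                                     std::uint32_t arg0) {   // MOV R4, c[0x0][0x20]
    PrologueResult out;
    out.r8 = ctaid_x * ntid_x + tid_x;  // IMAD R8, R0, c[0x0][0x8], R3
    out.r22 = out.r8 * 0x4bu + arg0;    // IMAD R22, R8, 0x4b, R4
    return out;
}
```

Under those assumptions, R8 would correspond to the familiar blockIdx.x * blockDim.x + threadIdx.x, and R22 to that index scaled by 0x4b (75) and added to the first argument, possibly a per-thread offset into a buffer.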

If you just need to know the meaning of each SASS instruction, NVIDIA provides one-line descriptions of most of them in the documentation but no details beyond that, e.g. “S2R” = “move from special register to general-purpose register”, “IMAD” = “integer multiply-add”, “I2I.U32.U16” = “integer conversion, from uint16_t to uint32_t”, etc.
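
To make the one-line description of I2I.U32.U16 concrete, here is a small C++ sketch of the conversion. It assumes the instruction reads the low 16 bits of the source register, which is how the unmodified form is usually understood; the helper name i2i_u32_u16 is made up for the example.

```cpp
#include <cstdint>

// Sketch of I2I.U32.U16 Rd, Rs: treat the low 16 bits of Rs as a uint16_t
// and zero-extend them to 32 bits. Assumes the low halfword is the source.
constexpr std::uint32_t i2i_u32_u16(std::uint32_t rs) {
    return static_cast<std::uint32_t>(static_cast<std::uint16_t>(rs));
}

// The upper 16 bits of the source register are discarded.
static_assert(i2i_u32_u16(0x12345678u) == 0x5678u, "upper half is dropped");
```

In source code this pattern typically shows up when an unsigned short value is compared or used in 32-bit arithmetic.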

The c[bank][offset] references are constant memory locations, in the case above most likely the kernel arguments (I think bank 0 is used for kernel arguments on all recent GPU architectures, with other banks used for compiler-generated immediate data and programmer-provided constant data). “@Px” marks a conditionally executed instruction and names the predicate register that holds the condition.

Thanks for your help, njuffa!
The second piece of code is SASS that I dumped from a closed-source application, and it is for sm_20. I just want to know what is done inside the kernels.

You are just showing a snippet, so it is not possible to know what’s inside each register. The ISETPs are comparisons, and PT is the predicate “true”. The second and third ISETP are chained to implement a compound comparison:

p0 = (r0 < 32) && (r3 >= r5);

r0 appears to be some sort of loop counter, since it is incremented by 1 in every iteration of the loop (which isn’t shown in full; it starts at 0x540). The first ISETP and the predicated EXIT that follows it are presumably some sort of early out from the loop, and in fact from the entire kernel (since EXIT is used): if (r5 > r6) return;
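
Putting those pieces together, the decision made by the instructions from 0x560 to 0x5a8 can be sketched in C++ as follows. This models one pass through the loop tail only; the part of the loop body between 0x540 and 0x560 is not shown in the post and may change R3 and R4, and the names Step, TailState, and model_loop_tail are made up for the example.

```cpp
#include <cstdint>

// Possible outcomes of one pass through the loop tail.
enum class Step { ExitKernel, NextIteration, LeaveLoop };

struct TailState {
    std::uint32_t r0;  // loop counter after this pass
    Step step;
};

// r0, r3, r4 are the register values on entry to 0x560; what r3 and r4
// represent in the original program is unknown.
inline TailState model_loop_tail(std::uint32_t r0, std::uint32_t r3, std::uint32_t r4) {
    const std::uint32_t a = static_cast<std::uint16_t>(r4);  // I2I.U32.U16 R5, R4
    const std::uint32_t b = static_cast<std::uint16_t>(r3);  // I2I.U32.U16 R6, R3
    if (a > b) {                        // ISETP.GT.U32.AND P0, PT, R5, R6, PT
        return {r0, Step::ExitKernel};  // @P0 EXIT
    }
    r0 += 1;                            // IADD R0, R0, 0x1
    // After the second pair of I2I, R5 holds b and R3 holds a.
    // ISETP.LT.AND (signed) and ISETP.GE.U32.AND are chained through P0.
    const bool p0 = (static_cast<std::int32_t>(r0) < 0x20) && (a >= b);
    return {r0, p0 ? Step::NextIteration : Step::LeaveLoop};  // @P0 BRA 0x540
}
```

When the branch is not taken, execution falls through to 0x5b0, where the value 1 is stored as a byte at [R23+0x8] before BRK and EXIT.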

Reverse engineering machine code without prior knowledge of what the code does is a very challenging assignment, on any platform. You can either spend the hours necessary yourself, or hire a highly-paid consultant to do it for you.

Single-stepping through the SASS code using the Nsight Eclipse or Visual Studio CUDA debugger might help in understanding the program flow.