How does the 'LEA' instruction work?

sundycoder · March 11, 2020, 7:00am

Example:

LEA R10, P2, R193.reuse, c[0x0][0x1f8], 0x2;

What operation does the LEA instruction perform?

SunilJB · March 11, 2020, 8:52am

Moving this to “CUDA Programming and Performance” forum so that CUDA team can take a look.

njuffa · March 11, 2020, 7:46pm

NVIDIA doesn’t document the details of the GPU machine instructions. I cannot find my reverse engineering notes for LEA right now. LEA is basically a left-shift-plus-add intended for 64-bit address computations. From memory (I will surely get something wrong here!), it is something like:

LEA d, a, b, c, s ===> d = ((a:c) << s) + b

where “:” denotes concatenation. To update a 64-bit pointer one would then use an LEA.cc instruction followed by an LEA.HI.X instruction to complete the 64-bit addition, because such a pointer requires two 32-bit GPU registers. In your example, it seems the pointer’s low part would be in c[0x0][0x1f8], c is not shown because it is RZ = 0 (?), and the shift factor is 2. So my best guess is

R10 = (R193 << 2) + c[0x0][0x1f8]
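To make the guess above concrete, here is a minimal Python sketch that emulates the shift-plus-add on 32-bit registers. It models only the from-memory description in this post, not documented NVIDIA semantics. The names `lea_lo`, `lea_hi_x`, and `scaled_address` are hypothetical helpers, and the behavior of the high-half step (taking the bits shifted out of the low half, then adding the carry) is an assumption.

```python
MASK32 = 0xFFFFFFFF  # GPU general-purpose registers are 32 bits wide


def lea_lo(a, b, shift):
    """Assumed low-half step: d = (a << shift) + b, truncated to 32 bits.

    Returns (d, carry_out) so a second instruction can finish a 64-bit add.
    """
    total = ((a << shift) & MASK32) + (b & MASK32)
    return total & MASK32, total >> 32


def lea_hi_x(a, a_hi, b_hi, shift, carry_in):
    """Assumed high-half step: upper 32 bits of ((a_hi:a) << shift),
    plus the high word of the base, plus the carry from the low half."""
    wide = ((a_hi << 32) | a) << shift
    hi_bits = (wide >> 32) & MASK32
    return (hi_bits + b_hi + carry_in) & MASK32


def scaled_address(index, base, shift):
    """Compute base + (index << shift) as a 64-bit value from 32-bit halves."""
    base_lo, base_hi = base & MASK32, (base >> 32) & MASK32
    lo, carry = lea_lo(index, base_lo, shift)
    hi = lea_hi_x(index, 0, base_hi, shift, carry)  # a_hi = 0 stands in for RZ
    return (hi << 32) | lo


# The example from the question, with made-up register contents:
# R10 = (R193 << 2) + c[0x0][0x1f8]
r10, carry = lea_lo(3, 0x1000, 2)
```

With a shift of 2 the index is scaled by 4, which is what one would expect when indexing an array of 4-byte elements such as `int` or `float`. Splitting the work into a low half with carry-out and a high half with carry-in is what lets two 32-bit instructions produce one 64-bit pointer.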

The P2 in your example should refer to a predicate register for use in predicated execution. cuobjdump --dump-sass normally doesn’t show predicates in that operand position, so I am not sure what to make of it. GPU machine instructions are architecture specific. What GPU architecture is the code shown in #0 for?
