Question on inline PTX

Suppose we have something like

unsigned int i;
unsigned int j;
unsigned int x[8];  // sits in local memory

How can we do

x[i] = j;

in inline PTX?

I don’t know the exact syntax for inline PTX, but you want to use the ‘st’ instruction with the correct modifiers: ‘st.local.u32’. I’d guess it’d be something like: ‘st.local.u32 x[%i], %j’

Hope that gets you on the right track. If not, the PTX spec is included in the ‘doc’ folder in the CUDA toolkit; you should be able to find the answer in there.

Your code may need to distinguish between sm_1x and sm_2x. For sm_1x, you want to use st.local, as Jack points out. For sm_2x, however, you want to use a generic store, per the section “Memory Space Conflicts” in the manual “Using Inline PTX Assembly in CUDA”.
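
For what it's worth, here is a rough sketch of the generic-store variant for sm_2x and later. It is only an illustration under a few assumptions: a 64-bit build (so the pointer goes in an "l" register), and the names store_kernel and store_and_read_back are made up for this example. Rather than indexing x inside the PTX string, the address of x[i] is computed in C and passed in as an operand.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel for illustration: performs x[i] = j with a generic
// PTX store, then writes x[i] to *out so the host can check the result.
__global__ void store_kernel(unsigned int i, unsigned int j, unsigned int *out)
{
    unsigned int x[8] = {0};   // per-thread array

    if (i >= 8) {              // guard against out-of-range indices
        return;
    }

    // 'st' with no state space is a generic store, so a plain C pointer works.
    // "l" = 64-bit register (assumes 64-bit pointers), "r" = 32-bit register.
    // The "memory" clobber tells the compiler that x may have changed.
    asm volatile("st.u32 [%0], %1;" :: "l"(&x[i]), "r"(j) : "memory");

    *out = x[i];
}

// Hypothetical host helper: launches one thread and returns the value that
// the kernel read back from x[i]. Error checking is omitted for brevity.
unsigned int store_and_read_back(unsigned int i, unsigned int j)
{
    unsigned int *d_out = nullptr;
    unsigned int h_out = 0;

    cudaMalloc(&d_out, sizeof(unsigned int));
    cudaMemset(d_out, 0, sizeof(unsigned int));
    store_kernel<<<1, 1>>>(i, j, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(unsigned int), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    return h_out;
}
```

Calling store_and_read_back(3, 42) from host code should give back 42 if the store landed in x[3]. An st.local version would need an address in the local window instead of a generic pointer, so it is left out of this sketch.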