Hi folks. I’m having a problem with a store instruction. I’m running Rodinia’s kmeans with a devcode directory so I can try different PTX compilations. The code is below:
ld.param.u64 %r2, [_Z11kmeansPointPfiiiPiS_S_S0__param_4];
cvta.to.global.u64 %r3, %r2;
st.local.s64 [ocelot_ls_stack + 4], %r3;
.......
ld.local.s64 %r151, [ocelot_ls_stack + 4];
add.s64 %r132, %r151, %r131;
//add.s64 %r132, %r3, %r131;
When the PTX in the devcode directory contains the “st.local” instruction, I get the error:
Cuda kernel kmeansPointexecution error: unspecified launch failure
Only the st instruction triggers it: if I comment it out, keep the ld, and uncomment the last add, there is no error, although I then have to replace the loaded value with the value in %r3. The array declaration is not the problem, since other variables are being spilled to and loaded from it.
ptxas compiles it, but using the resulting binary file inside the devcode directory gives the same error.
Specs: NVCC 4.1, Arch Linux 64-bit
Is it an nvcc / CUDA bug?
Thx in advance
Diogo Sampaio
Found the problem: a 64-bit st must be 8-byte aligned, and the offset in [ocelot_ls_stack + 4] is 4, which is not a multiple of 8.
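
For anyone generating spill code by hand, here is a minimal sketch of how the slot offsets could be assigned so that every access is naturally aligned. It is written in Python only to show the arithmetic. The helper names (align_up, is_aligned, assign_spill_offsets) are made up for this illustration and are not part of Ocelot or any NVIDIA tool. It also assumes the base of ocelot_ls_stack is itself at least 8-byte aligned.

```python
def align_up(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment (must be a power of two)."""
    return (offset + alignment - 1) & ~(alignment - 1)


def is_aligned(offset: int, size: int) -> bool:
    """True if an access of `size` bytes at `offset` is naturally aligned."""
    return offset % size == 0


def assign_spill_offsets(sizes):
    """Give each spilled value (size in bytes) an offset aligned to its own size.

    Hypothetical helper: assumes the stack base is aligned to the largest size.
    """
    offsets = []
    cursor = 0
    for size in sizes:
        cursor = align_up(cursor, size)  # pad so the slot is naturally aligned
        offsets.append(cursor)
        cursor += size
    return offsets


# The offset from the post: a 64-bit (8-byte) store at +4 is misaligned.
print(is_aligned(4, 8))
# The next usable offset for that 8-byte slot.
print(align_up(4, 8))
# A 4-byte spill followed by an 8-byte spill, then another of each.
print(assign_spill_offsets([4, 8, 4, 8]))
```

With offsets assigned this way, the 64-bit value would go at ocelot_ls_stack + 8 instead of + 4, at the cost of 4 bytes of padding after the first 32-bit slot.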