Strange performance behavior

Both the PTX file and the ptxas output could be correct if ptxas has optimized away the local memory allocation. However, I don't know whether ptxas performs optimizations at that high a level.
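
One way to check this would be to build a small test case and compare what the PTX declares against what ptxas reports. The sketch below is only an illustration: the kernel `sum_scaled`, the array `tmp`, and the file name `local_test.cu` are made up for this purpose and are not taken from the code under discussion. Whether the per-thread array shows up as `.local` in the PTX at all probably depends on the toolkit version and the optimization flags.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from the original code). The small per-thread
// array is a candidate for local memory; with fully unrolled, constant
// indexing the toolchain may keep it in registers instead.
__global__ void sum_scaled(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float tmp[4];  // per-thread scratch array
    #pragma unroll
    for (int k = 0; k < 4; ++k)
        tmp[k] = in[i] * (k + 1);

    float sum = 0.0f;
    #pragma unroll
    for (int k = 0; k < 4; ++k)
        sum += tmp[k];

    out[i] = sum;
}

int main()
{
    const int n = 256;
    float *in = nullptr;
    float *out = nullptr;

    // Managed memory keeps the host side short; error checks are omitted.
    cudaMallocManaged(&in, n * sizeof(float));
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i)
        in[i] = 1.0f;

    sum_scaled<<<(n + 127) / 128, 128>>>(in, out, n);
    cudaDeviceSynchronize();
    printf("out[0] = %f\n", out[0]);

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```

Assuming the file is saved as `local_test.cu`, `nvcc -ptx local_test.cu` writes the PTX, which can be searched for `.local` declarations, and `nvcc -Xptxas -v -c local_test.cu` makes ptxas print its per-kernel resource usage. If the PTX declares local storage but ptxas reports none, that would suggest the allocation was removed at the ptxas stage rather than one of the two outputs being wrong.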