Local vs Shared Memory: execution slows down when using shared memory

I might be able to tell you something if you posted both versions of your code.