bad sector..?

Hello,

I am not even sure what to call this.

Initially, I had a few threads read global memory in a kernel function:

if (i < some_value)
{
    shared[i] = global[y + i];
}

__syncthreads();

This code section read garbage: the shared memory contained garbage afterwards, even though a mouse-over in the debugger perspective showed correct values in the global memory array and no race conditions were present. So I added the following to aid debugging:

if (i < some_value)
{
    shared[i] = global[y + i];
}

__syncthreads();

if (i == 0)
{
    [read global]
}

__syncthreads();

Now, when stepping through the section, thread 0 would 'forget that it is thread 0' and simply jump over the added section.
I have seen this before with poor synchronization (threads ceasing to execute logic properly), so I added another section to test whether the program had left its rails by this point:

if (i < some_value)
{
    shared[i] = global[y + i];
}

__syncthreads();

if (i == 0)
{
    [read global]
}

__syncthreads();

if (i == 0)
{
    [few lines of code]
}

__syncthreads();

Amazingly, thread 0 would simply jump over the 1st (i == 0) section and would subsequently execute the 2nd (i == 0) section.
The value of i does not change at any point, and a mouse-over shows the correct value in i.
At the same time, the other threads of the same block seem to be in the right place.

How is that possible, and what can I test (for)?
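
One thing that can be tested for is whether the CUDA runtime reports any error for the launch at all. Below is a minimal sketch of that check, not my actual code: the kernel `probe` and the macro `CUDA_CHECK` are hypothetical names made up for illustration, and the sizes are arbitrary.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with file and line if a runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical stand-in for the real kernel.
__global__ void probe(const int *global, int *out, int n)
{
    int i = threadIdx.x;
    if (i < n)
        out[i] = global[i];
}

int main()
{
    const int n = 32;
    int *d_in = nullptr;
    int *d_out = nullptr;
    CUDA_CHECK(cudaMalloc(&d_in, n * sizeof(int)));
    CUDA_CHECK(cudaMalloc(&d_out, n * sizeof(int)));
    CUDA_CHECK(cudaMemset(d_in, 0, n * sizeof(int)));

    probe<<<1, n>>>(d_in, d_out, n);
    CUDA_CHECK(cudaGetLastError());       // errors detected at launch time
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised while the kernel ran

    CUDA_CHECK(cudaFree(d_in));
    CUDA_CHECK(cudaFree(d_out));
    return 0;
}
```

Running the program under `compute-sanitizer` should additionally flag out-of-bounds global or shared memory accesses. It may also be worth confirming that the kernel is built with device debug information (`nvcc -G`), since stepping through optimized device code can make the debugger appear to skip lines.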

I have placed the kernel code in a shared library, but the other kernel functions seemingly execute fine.

If I change the code within the skipped if section to use existing shared memory instead of additional local memory, it actually executes.
So I suppose I over-extended my reach for L1; I do not know.
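
To get some numbers behind the local memory suspicion, the runtime can be asked how much local memory, shared memory and how many registers a kernel uses. This is only a sketch under assumptions: `uses_local` is a hypothetical kernel standing in for mine, and whether its array really ends up in local memory depends on the compiler.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel with a per-thread array that the compiler
// may place in local memory (dynamic indexing usually forces this).
__global__ void uses_local(int *out)
{
    int scratch[64];
    for (int k = 0; k < 64; ++k)
        scratch[k] = k * threadIdx.x;
    out[threadIdx.x] = scratch[threadIdx.x % 64];
}

int main()
{
    // Per-kernel resource usage as seen by the runtime.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, uses_local);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("local bytes per thread : %zu\n", attr.localSizeBytes);
    std::printf("static shared bytes    : %zu\n", attr.sharedSizeBytes);
    std::printf("registers per thread   : %d\n", attr.numRegs);

    // Current per-thread stack limit of the device.
    size_t stackBytes = 0;
    err = cudaDeviceGetLimit(&stackBytes, cudaLimitStackSize);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s\n", cudaGetErrorString(err));
        return 1;
    }
    std::printf("stack limit per thread : %zu\n", stackBytes);
    return 0;
}
```

Compiling with `nvcc -Xptxas -v` prints similar figures (registers, spill stores and loads) at build time, which may be easier when the kernel lives in a shared library.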

And as for the initial bad global memory read: I had sized the memory allocation incorrectly.
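
For completeness, a minimal sketch of the sizing issue, assuming the global array has to cover every index up to y + some_value - 1. The kernel name `load_tile`, the constant `SOME_VALUE` and the values chosen here are hypothetical, not taken from my code; the extra `y + i < n` guard is one way to keep an undersized allocation from being read out of bounds.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int SOME_VALUE = 8;  // hypothetical number of elements loaded per block

__global__ void load_tile(const int *global, int n, int y, int *out)
{
    __shared__ int shared[SOME_VALUE];
    int i = threadIdx.x;

    // Guard on the tile size and on the real length of the allocation.
    if (i < SOME_VALUE && y + i < n)
        shared[i] = global[y + i];

    __syncthreads();

    if (i < SOME_VALUE && y + i < n)
        out[i] = shared[i];
}

int main()
{
    const int y = 4;
    const int n = y + SOME_VALUE;  // must cover index y + SOME_VALUE - 1
    std::vector<int> h(n);
    for (int k = 0; k < n; ++k)
        h[k] = k;

    int *d_in = nullptr;
    int *d_out = nullptr;
    // cudaMalloc takes a size in bytes, not in elements.
    if (cudaMalloc(&d_in, n * sizeof(int)) != cudaSuccess) return 1;
    if (cudaMalloc(&d_out, SOME_VALUE * sizeof(int)) != cudaSuccess) return 1;
    if (cudaMemcpy(d_in, h.data(), n * sizeof(int),
                   cudaMemcpyHostToDevice) != cudaSuccess) return 1;

    load_tile<<<1, 32>>>(d_in, n, y, d_out);

    int r[SOME_VALUE];
    // This blocking copy also waits for the kernel to finish.
    if (cudaMemcpy(r, d_out, sizeof(r), cudaMemcpyDeviceToHost) != cudaSuccess)
        return 1;

    for (int k = 0; k < SOME_VALUE; ++k) {
        if (r[k] != y + k) {
            std::printf("mismatch at %d\n", k);
            return 1;
        }
    }
    std::printf("all values match\n");

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```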