__syncthreads(); on tensor GPU

Hi,
I have code compiled with CUDA 10 that runs successfully on a Maxwell GPU (Amazon G3 instance).
When running on a tensor GPU (Amazon T3 instance), it fails.

It fails at a __syncthreads() call.
The code looks like this:
if ()
{
    // path A
    :
    :
    __syncthreads();
    :
    __syncthreads();
    :
}
else
{
    // path B
    :
    :
    __syncthreads();
    __syncthreads();
    :
}

I have the same number of __syncthreads() calls on both the A and B paths.
The reason I use __syncthreads() is that the shared memory is reused by all threads in the block.
Do you have any idea why it fails on the “tensor GPU”?
Do you have a solution?
I am also running this code on a Pascal GPU on my PC (GTX 1080) and it works fine.
Thanks
Oren

__syncthreads() inside conditional code is often a problem. It's difficult to say whether that is applicable here, since you've shown almost no code.

If you have an illegal use of __syncthreads() in conditional code, it may appear to run correctly on one GPU architecture but fail on another.

Run your code with the synccheck sub-tool of cuda-memcheck (cuda-memcheck --tool synccheck followed by your application). It reports barriers that are not reached uniformly by the block.

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#synchronization-functions
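To make the hazard concrete, here is a minimal made-up kernel (not your actual code; the name, sizes, and condition are invented for illustration) that synccheck would flag. Each __syncthreads() statement is a distinct barrier that must be reached by every thread of the block, so a "matching count" of barriers on the two branches does not make the code legal. Starting with Volta-class GPUs (independent thread scheduling), such code tends to fail where it happened to work on Maxwell/Pascal, which would be consistent with what you are seeing.

```cuda
#include <cstdio>

// Hypothetical kernel illustrating the hazard: __syncthreads() is
// executed under a condition that is not uniform across the block.
// Behavior is undefined; it may appear to work on one architecture
// and hang or produce wrong results on another.
__global__ void divergent_sync(float *data)
{
    __shared__ float buf[256];
    int t = threadIdx.x;

    if (t < 128) {          // condition differs within the block
        buf[t] = data[t];
        __syncthreads();    // illegal: only half the block reaches it
        data[t] = buf[127 - t];
    } else {
        __syncthreads();    // a "matching" barrier on the other path
                            // does NOT pair with the one above
    }
}
```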

Thanks for the answer.
I understand from your answer that __syncthreads() is not allowed in conditional code if the condition does not evaluate the same for all threads in the block.
Do you have a solution for syncing only the threads that go through the A path (in the conditional code)?
The threads on the B path do not use the shared memory, and they could continue running.

if (samecond) {

}
__syncthreads();
if (samecond) {

}
__syncthreads();
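Fleshing that suggestion out: the idea is to hoist the barriers out of the divergent regions, so that every thread in the block reaches each __syncthreads(), while the conditional work stays guarded. A sketch along those lines (kernel name, sizes, and condition are invented for the example):

```cuda
__global__ void restructured(float *data)
{
    __shared__ float buf[256];
    int t = threadIdx.x;
    bool onPathA = (t < 128);   // hypothetical per-thread condition

    if (onPathA) {
        buf[t] = data[t];       // phase 1: A-path threads fill shared memory
    }
    __syncthreads();            // reached by ALL threads in the block

    if (onPathA) {
        data[t] = buf[127 - t]; // phase 2: A-path threads consume it
    }
    __syncthreads();            // again reached by the whole block
}
```

The B-path threads do have to wait at the barriers rather than running ahead, but reaching a barrier with no work pending is cheap, and it keeps the program within what the programming model guarantees.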