__syncthreads thread synchronization

This is about the matrix multiplication example on page 25 of the CUDA Programming Guide 2.3.
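For context, here is a minimal sketch of a shared-memory (tiled) matrix multiplication, assuming that is the version of the example being asked about. It is written in the spirit of the guide's kernel, not copied from it. The names `matmul_tiled`, `run_matmul_check`, and `TILE` are my own, and the code assumes square matrices whose size is a multiple of the tile width. It shows the two places where `__syncthreads()` is needed.

```cuda
#include <cstdio>
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

#define TILE 16  // hypothetical tile width; n is assumed to be a multiple of it

// Computes C = A * B for square n x n row-major matrices.
__global__ void matmul_tiled(const float* A, const float* B, float* C, int n)
{
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < n / TILE; ++t) {
        // Each thread loads one element of the current A tile and B tile.
        As[threadIdx.y][threadIdx.x] = A[row * n + t * TILE + threadIdx.x];
        Bs[threadIdx.y][threadIdx.x] = B[(t * TILE + threadIdx.y) * n + col];

        // Barrier 1: every thread in the block must finish its loads
        // before any thread reads the tiles written by the others.
        __syncthreads();

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];

        // Barrier 2: every thread must finish reading the tiles before
        // any thread overwrites them in the next iteration.
        __syncthreads();
    }
    C[row * n + col] = acc;
}

// Host helper (hypothetical): runs the kernel and returns the largest
// absolute difference from a plain CPU reference. Error checking of the
// CUDA calls is omitted for brevity.
float run_matmul_check(int n)
{
    size_t bytes = (size_t)n * n * sizeof(float);
    std::vector<float> hA(n * n), hB(n * n), hC(n * n);
    for (int i = 0; i < n * n; ++i) {
        hA[i] = 0.1f * (i % 7);
        hB[i] = 0.1f * (i % 5);
    }

    float *dA, *dB, *dC;
    cudaMalloc(&dA, bytes);
    cudaMalloc(&dB, bytes);
    cudaMalloc(&dC, bytes);
    cudaMemcpy(dA, hA.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dB, hB.data(), bytes, cudaMemcpyHostToDevice);

    dim3 block(TILE, TILE);
    dim3 grid(n / TILE, n / TILE);
    matmul_tiled<<<grid, block>>>(dA, dB, dC, n);
    cudaMemcpy(hC.data(), dC, bytes, cudaMemcpyDeviceToHost);

    float max_err = 0.0f;
    for (int r = 0; r < n; ++r)
        for (int c = 0; c < n; ++c) {
            float ref = 0.0f;
            for (int k = 0; k < n; ++k)
                ref += hA[r * n + k] * hB[k * n + c];
            max_err = fmaxf(max_err, fabsf(ref - hC[r * n + c]));
        }

    cudaFree(dA);
    cudaFree(dB);
    cudaFree(dC);
    return max_err;
}

int main()
{
    printf("max abs error: %g\n", run_matmul_check(64));
    return 0;
}
```

Note that `__syncthreads()` only synchronizes the threads within one block, and every thread of the block has to reach it. That works here because the loop bound `n / TILE` is the same for all threads.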