Can you be a bit more specific? Your question has several possible answers, depending on the level of detail/understanding you require.
In the absolute sense, the threads in a block are not all executing at the same instant. A thread block is basically a virtual multiprocessor, but the real multiprocessors have only 8 streaming processors each. Because of the way instructions are pipelined, groups of 32 threads (called a 'warp') will always appear to the programmer to run concurrently - if you have a race condition within a warp, it's impossible to predict which thread will 'win' the race.
However, the hardware scheduler makes no guarantees about the order in which warps within the same block are executed, unless a `__syncthreads()` (or similar) call is present. So unless your code uses those calls, your program should treat all threads within a block as running concurrently and make no assumptions about their relative order. Even those calls only guarantee consistency at a single point in the code - which warp leaves the `__syncthreads()` first is not defined.
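To make the warp-ordering point concrete, here is a minimal sketch of a kernel where threads in different warps exchange data through shared memory. The kernel name `reverseBlock`, the `tile` array and the block size of 128 are all made up for illustration, and error checking is left out for brevity. Each thread writes one slot and then reads a slot written by a thread in a different warp, so the barrier between the write and the read is what makes the result well defined.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define BLOCK_SIZE 128  // 4 warps of 32 threads; an arbitrary choice for this sketch

// Hypothetical kernel: each thread writes one shared-memory slot, then reads
// the mirrored slot, which belongs to a thread in a different warp.
__global__ void reverseBlock(const int *in, int *out)
{
    __shared__ int tile[BLOCK_SIZE];
    int tid = threadIdx.x;

    tile[tid] = in[tid];

    // Without this barrier, the read below could run before the warp that
    // owns tile[BLOCK_SIZE - 1 - tid] has written it, because the order in
    // which warps of a block execute is not defined.
    __syncthreads();

    out[tid] = tile[BLOCK_SIZE - 1 - tid];
}

int main()
{
    int h_in[BLOCK_SIZE], h_out[BLOCK_SIZE];
    for (int i = 0; i < BLOCK_SIZE; ++i) h_in[i] = i;

    int *d_in = 0, *d_out = 0;
    cudaMalloc((void **)&d_in, sizeof(h_in));
    cudaMalloc((void **)&d_out, sizeof(h_out));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    // One block only: __syncthreads() synchronises threads within a block.
    reverseBlock<<<1, BLOCK_SIZE>>>(d_in, d_out);

    // This blocking copy also waits for the kernel to finish.
    cudaMemcpy(h_out, d_out, sizeof(h_out), cudaMemcpyDeviceToHost);

    // Count elements that are not the mirror image of the input.
    int errors = 0;
    for (int i = 0; i < BLOCK_SIZE; ++i)
        if (h_out[i] != BLOCK_SIZE - 1 - i) ++errors;
    printf("mismatches: %d\n", errors);

    cudaFree(d_in);
    cudaFree(d_out);
    return errors != 0;
}
```

If you remove the `__syncthreads()` the kernel may still happen to give the right answer on a particular run or card, but nothing guarantees it. Note also that the barrier only says every thread's write has completed before any thread's read starts; it says nothing about which warp continues first afterwards.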