Is memory access in a kernel function parallel or sequential?

Hi everyone, I want to know whether, in a __global__ function launched with kernel<<<dimBlocks,dimThreads>>>(…);, memory accesses happen in parallel or sequentially (perhaps because of the bus?). Example:

__global__ void SumVect(...){

  ...

  VettRis[TID]=Vett1[TID]+Vett2[TID];

  ...

}

Do the threads work in parallel, or sequentially because they must all access the same global memory?

Most of the time the accesses happen in parallel. They are serialized on a compute capability 1.0 or 1.1 card if the threads do not access a suitably aligned, consecutive block of memory, or on any card if the addresses are too far apart to be serviced by a single memory transaction.
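To make the difference concrete, here is a minimal sketch that runs your vector sum with two access patterns and times each one. The kernel names SumVectCoalesced and SumVectStrided, the vector size, the block size and the stride of 33 are all my own assumptions for illustration; they are not taken from your code. In the first kernel thread i touches element i, so a warp reads consecutive floats. In the second, neighbouring threads touch elements 33 floats apart, so their addresses are spread over many memory segments.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Minimal error check (illustrative helper, not part of the original post).
#define CHECK(call) do { cudaError_t e = (call); if (e != cudaSuccess) { \
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(e)); exit(1); } } while (0)

// Coalesced pattern: thread TID touches element TID, so the threads of a
// warp access one consecutive, aligned block of memory.
__global__ void SumVectCoalesced(const float *Vett1, const float *Vett2,
                                 float *VettRis, int n)
{
    int TID = blockIdx.x * blockDim.x + threadIdx.x;
    if (TID < n)
        VettRis[TID] = Vett1[TID] + Vett2[TID];
}

// Strided pattern: neighbouring threads touch elements `stride` apart, so
// the addresses requested by one warp are far away from each other.
__global__ void SumVectStrided(const float *Vett1, const float *Vett2,
                               float *VettRis, int n, int stride)
{
    int TID = blockIdx.x * blockDim.x + threadIdx.x;
    if (TID < n) {
        int idx = (int)(((long long)TID * stride) % n);
        VettRis[idx] = Vett1[idx] + Vett2[idx];
    }
}

int main()
{
    const int n = 1 << 20;   // assumed vector length (a power of two)
    const int stride = 33;   // odd, so (TID * stride) % n hits every element once
    const size_t bytes = n * sizeof(float);

    float *h1 = (float *)malloc(bytes);
    float *h2 = (float *)malloc(bytes);
    float *hr = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { h1[i] = (float)i; h2[i] = 2.0f * i; }

    float *d1, *d2, *dr;
    CHECK(cudaMalloc(&d1, bytes));
    CHECK(cudaMalloc(&d2, bytes));
    CHECK(cudaMalloc(&dr, bytes));
    CHECK(cudaMemcpy(d1, h1, bytes, cudaMemcpyHostToDevice));
    CHECK(cudaMemcpy(d2, h2, bytes, cudaMemcpyHostToDevice));

    dim3 dimThreads(256);
    dim3 dimBlocks((n + dimThreads.x - 1) / dimThreads.x);

    cudaEvent_t start, stop;
    CHECK(cudaEventCreate(&start));
    CHECK(cudaEventCreate(&stop));
    float msCoalesced = 0.0f, msStrided = 0.0f;

    // Warm-up launch so the first timed kernel does not pay start-up costs.
    SumVectCoalesced<<<dimBlocks, dimThreads>>>(d1, d2, dr, n);

    CHECK(cudaEventRecord(start));
    SumVectCoalesced<<<dimBlocks, dimThreads>>>(d1, d2, dr, n);
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    CHECK(cudaEventElapsedTime(&msCoalesced, start, stop));

    // Clear the result so the check below really tests the strided kernel.
    CHECK(cudaMemset(dr, 0, bytes));
    CHECK(cudaEventRecord(start));
    SumVectStrided<<<dimBlocks, dimThreads>>>(d1, d2, dr, n, stride);
    CHECK(cudaEventRecord(stop));
    CHECK(cudaEventSynchronize(stop));
    CHECK(cudaEventElapsedTime(&msStrided, start, stop));

    CHECK(cudaMemcpy(hr, dr, bytes, cudaMemcpyDeviceToHost));
    int errors = 0;
    for (int i = 0; i < n; ++i)
        if (hr[i] != h1[i] + h2[i]) ++errors;

    printf("coalesced: %.3f ms, strided: %.3f ms, errors: %d\n",
           msCoalesced, msStrided, errors);

    cudaEventDestroy(start); cudaEventDestroy(stop);
    cudaFree(d1); cudaFree(d2); cudaFree(dr);
    free(h1); free(h2); free(hr);
    return errors != 0;
}
```

Both kernels compute the same result; only the order in which threads touch memory differs. I would expect the strided version to be slower, but how much slower depends on the card and its caches, so treat the timings as something to measure on your own hardware rather than a fixed ratio.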

Check appendices F.3.2 and F.4.2 of the Programming Guide.